
POLYKETIDES AND THEIR SYNTHESIS 



The present invention relates to processes and
materials (including enzyme systems, nucleic acids,
vectors and cultures) for preparing polyketides,
particularly polyethers but including polyenes,
macrolides and other polyketides by recombinant
synthesis, and to the polyketides so produced,
particularly novel polyketides. (N.B. the term
"polyketide" is being used in its conventional sense to
include structures notionally derived by the reduction
and/or other processing or modification of one or more
ketide units.) Furthermore the invention provides the
entire nucleic acid sequence of the biosynthetic gene
cluster that governs the production of the ionophoric
antibiotic polyether polyketide monensin in Streptomyces
cinnamonensis, and the use of all or part of the cloned
DNA: first, in the specific detection of other polyether
biosynthetic gene clusters; secondly, in the engineering
of mutant strains of S. cinnamonensis and of other
actinomycetes which are suitable host strains for the
high level production of novel recombinant polyketides;
and thirdly, in the provision of recombinant biosynthetic
genes which lead to such novel polyketide products.

Polyketides are a large and structurally diverse
class of natural products that includes many compounds
possessing antibiotic or other pharmacological
properties, such as erythromycin, tetracyclines,
rapamycin, avermectin, monensin, epothilones and FK506.
In particular, polyketides are abundantly produced by
Streptomyces and related actinomycete bacteria. They are
synthesised by the repeated stepwise condensation of
acylthioesters in a manner analogous to that of fatty
acid biosynthesis. The greater structural diversity found
among natural polyketides arises from the selection of
(usually) acetate or propionate as "starter" or
"extender" units; and from the differing degree of
processing of the 3-keto group observed after each
condensation. Examples of processing steps include
reduction to β-hydroxyacyl-, reduction followed by
dehydration to 2-enoyl-, and complete reduction to the
saturated acylthioester. The stereochemical outcome of
these processing steps is also specified for each cycle
of chain extension. In addition, the biosynthetic
pathways to many polyketides involve additional
enzyme-catalysed modifications which may include:
methylation by O- and C-methyltransferases, hydroxylation
by cytochrome P450 enzymes, other oxidation or reduction
processes, and the biosynthesis and attachment of novel
sugars and/or deoxy sugars.

The biosynthesis of polyketides is initiated by a
group of chain-forming enzymes known as polyketide
synthases. Two classes of polyketide synthase (PKS) have
been described in actinomycetes. One class, named Type I
PKSs, represented by the PKSs for the macrolides
erythromycin, oleandomycin, avermectin and rapamycin,
consists of a different set or "module" of enzymes for
each cycle of polyketide chain extension. (For examples
see Cortes, J. et al. Nature (1990) 348:176-178; Donadio,
S. et al. Science (1991) 252:675-679; Swan, D.G. et al.
Mol. Gen. Genet. (1994) 242:358-362; MacNeil, D.J. et al.
Gene (1992) 115:119-125; Schwecke, T. et al. Proc. Natl.
Acad. Sci. USA (1995) 92:7839-7843.)
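
By way of illustration only, the degree of processing of the 3-keto
group in a single cycle of chain extension can be sketched as a
simple lookup on which reductive steps are carried out in that cycle.
The step names ("ketoreduction", "dehydration", "enoylreduction") and
the function name are labels assumed for this sketch and are not
terms defined in this specification; stereochemical outcome is not
modelled.

```python
# Illustrative sketch only: maps the reductive steps performed in one
# chain-extension cycle to the resulting state of the 3-position.
# Step names are hypothetical labels chosen for this example.

def beta_carbon_outcome(steps):
    """Return the acyl thioester type left after one extension cycle.

    `steps` is any iterable of step names carried out in that cycle.
    Each later step is only meaningful if the earlier ones occurred,
    so the checks are made in biosynthetic order.
    """
    done = set(steps)
    if "ketoreduction" not in done:
        return "3-ketoacyl"        # keto group left unprocessed
    if "dehydration" not in done:
        return "3-hydroxyacyl"     # reduction only
    if "enoylreduction" not in done:
        return "2-enoyl"           # reduction followed by dehydration
    return "saturated acyl"        # complete reduction


if __name__ == "__main__":
    # Three hypothetical cycles with differing degrees of processing
    cycles = [
        [],
        ["ketoreduction"],
        ["ketoreduction", "dehydration", "enoylreduction"],
    ]
    for number, steps in enumerate(cycles, start=1):
        print(number, beta_carbon_outcome(steps))
```

In a modular Type I PKS each cycle would be served by its own module,
so a list of such step sets, one per module, would describe the
processing pattern along the whole chain.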

The term "extension module" as used herein refers to
the set of contiguous domains, from a β-ketoacyl-ACP
synthase ("KS") domain to the next acyl carrier protein
("ACP") domain, which accomplishes one cycle of
polyketide chain extension. The term "loading module" is
used to refer to any group of contiguous domains which
accomplishes the loading of the starter unit onto the PKS
and thus renders it available to the KS domain of the
first extension module. The length of polyketide formed
has been altered, in the case of erythromycin
biosynthesis, by specific relocation using genetic
engineering of the enzymatic domain of the
erythromycin-producing PKS that contains the chain
releasing thioesterase/cyclase activity (Cortes, J. et al.
Science (1995) 268:1487-1489; Kao, C.M. et al. J. Am.
Chem. Soc. (1995) 117:9105-9106).

In-frame deletion of the DNA encoding part of the
ketoreductase domain in module 5 of the
erythromycin-producing PKS (also known as
6-deoxyerythronolide B synthase, DEBS) has been shown to
lead to the formation of erythromycin analogues
5,6-dideoxy-3-α-mycarosyl-5-oxoerythronolide B,
5,6-dideoxy-5-oxoerythronolide B and
5,6-dideoxy,6-β-epoxy-5-oxoerythronolide B (Donadio, S.
et al. Science (1991) 252:675-679). Likewise, alteration
of active site residues in the enoylreductase domain of
module 4 in DEBS, by genetic engineering of the
corresponding PKS-encoding DNA and its introduction into
Saccharopolyspora erythraea, led to the production of
6,7-anhydroerythromycin C (Donadio, S. et al. Proc. Natl.
Acad. Sci. USA (1993) 90:7119-7123).

International Patent Application number WO 93/13663
describes additional types of genetic manipulation of the
DEBS genes that are capable of producing altered
polyketides. However many such attempts are reported to
have been unproductive (Hutchinson, C.R. and Fujii, I.
Annu. Rev. Microbiol. (1995) 49:201-238, at p. 231). The
complete DNA sequence of the genes from Streptomyces
hygroscopicus that encode the modular Type I PKS
governing the biosynthesis of the macrocyclic
immunosuppressant polyketide rapamycin has been disclosed
(Schwecke, T. et al. (1995) Proc. Natl. Acad. Sci. USA
92:7839-7843). The DNA sequence is deposited in the
EMBL/Genbank Database under the accession number X86780.

WO 98/01546 discloses that a PKS gene assembly
(particularly of Type I) encodes a loading module which
is followed by at least one extension module. In the case
of DEBS, the first open reading frame encodes the first
multi-enzyme or cassette (DEBS1) which consists of three
modules: the loading module (ery-load) and two extension
modules (modules 1 and 2). The loading module comprises an
acyltransferase and an acyl carrier protein. This may be
contrasted with Figure 1 of WO 93/13663 (referred to
above), which shows ORF1 as only two modules, the first of
which is in fact both the loading module and the first
extension module.
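
As a hedged illustration of the definitions of "loading module" and
"extension module" used herein, the following sketch partitions an
ordered list of domains into a loading module and extension modules,
each extension module running from a KS domain to the next ACP
domain. The function name and the example domain order for DEBS1
(in particular the ketoreductase "KR" entries) are assumptions made
for this example; the text above states only that the loading module
comprises an acyltransferase ("AT") and an ACP and that two extension
modules follow.

```python
# Illustrative sketch only: split a linear list of PKS domains into a
# loading module, extension modules (KS ... ACP), and any trailing
# domains (for example a chain-releasing thioesterase).

def split_modules(domains):
    loading, modules, trailing = [], [], []
    current = None  # extension module being assembled, if any
    for domain in domains:
        if domain == "KS":
            # A KS domain opens a new extension module
            current = ["KS"]
        elif current is not None:
            current.append(domain)
            if domain == "ACP":
                # The next ACP after a KS closes the extension module
                modules.append(current)
                current = None
        elif modules:
            # Domains found after the last complete extension module
            trailing.append(domain)
        else:
            # Domains found before the first KS form the loading module
            loading.append(domain)
    return loading, modules, trailing


if __name__ == "__main__":
    # Assumed domain order for DEBS1, for illustration only
    debs1 = ["AT", "ACP",
             "KS", "AT", "KR", "ACP",
             "KS", "AT", "KR", "ACP"]
    loading, modules, trailing = split_modules(debs1)
    print("loading module:", loading)
    print("extension modules:", modules)
    print("trailing domains:", trailing)
```

Read in this way, a depiction that shows only two modules in the
first open reading frame has merged the leading AT and ACP domains
into the first KS-to-ACP unit.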

WO 98/01546 describes in general terms the
production of a hybrid PKS gene assembly comprising a
loading module and at least one extension module. It also
describes (see also Marsden, A.F.A. et al. Science (1998)
279:199-202) construction of a hybrid PKS gene assembly
by grafting the wide-specificity loading module for the
avermectin-producing polyketide synthase onto the first
multi-enzyme component (DEBS1) for the erythromycin PKS
in place of the normal loading module. Certain novel
polyketides can be prepared using the hybrid PKS gene
assembly, as described for example in WO 98/01571.

WO 98/01546 further describes the construction of a
hybrid PKS gene assembly by grafting the loading module
for the rapamycin-producing polyketide synthase onto the
first multi-enzyme component (DEBS1) for the erythromycin
PKS in place of the normal loading module. The loading
module of the rapamycin PKS differs from the loading
modules of DEBS and the avermectin PKS in that it
comprises a CoA ligase domain, an enoylreductase ("ER")
domain and an ACP, so that suitable organic acids
including the natural starter unit
3,4-dihydroxycyclohexane carboxylic acid may be activated
in situ on the PKS loading domain and, with or without
reduction by the ER domain, transferred to the ACP for
intramolecular loading of the KS of extension module 1
(Schwecke, T. et al. Proc. Natl. Acad. Sci. USA (1995)
92:7839-7843). WO 98/51695 and WO 98/49315 describe
additional types of genetic manipulation of the DEBS
genes that are capable of producing altered polyketides.

The second class of PKS, named Type II PKSs, is
represented by the synthases for aromatic compounds. Type
II PKSs contain only a single set of enzymatic activities
for chain extension and these are re-used as appropriate
in successive cycles (Bibb, M.J. et al. EMBO J. (1989)
8:2727-2736; Sherman, D.H. et al. EMBO J. (1989)
8:2717-2725; Fernandez-Moreno, M.A. et al. J. Biol. Chem.
(1992) 267:19278-19290). The "extender" units for the
Type II PKSs are usually acetate units, and the presence
of specific cyclases dictates the preferred pathway for
cyclisation of the completed chain into an aromatic
product (Hutchinson, C.R. and Fujii, I. Annu. Rev.
Microbiol. (1995) 49:201-238). Hybrid polyketides have
been obtained by the introduction of cloned Type II PKS
gene-containing DNA into another strain containing a
different Type II PKS gene cluster, for example by
introduction of DNA derived from the gene cluster for
actinorhodin, a blue-pigmented polyketide from
Streptomyces coelicolor, into an anthraquinone
polyketide-producing strain of Streptomyces galileus
(Bartel, P.L. et al. J. Bacteriol. (1990) 172:4816-4826).

The minimal number of domains required for
polyketide chain extension on a Type II PKS when
expressed in a Streptomyces coelicolor host cell (the
"minimal PKS") has been defined for example in WO
95/08548 as containing the following three polypeptides
which are products of the actI genes: firstly KS;
secondly a polypeptide termed the CLF with end-to-end
amino acid sequence similarity to the KS but in which the
essential active site residue of the KS, namely a
cysteine residue, is substituted either by a glutamine
residue or, in the case of the PKS for a spore pigment
such as the whiE gene product (Davis, N.K. and Chater,
K.F. Mol. Microbiol. (1990) 4:1679-1691) by a glutamic
acid residue; and finally an ACP. The CLF has been stated
(for example in WO 95/08548) to be a factor that
determines the chain length of the polyketide chain that
is produced by the minimal PKS. However it has been found
(Shen, B. et al. J. Am. Chem. Soc. (1995) 117:6811-6821)
that when the CLF for the octaketide actinorhodin is used
to replace the CLF for the decaketide tetracenomycin in
host cells of Streptomyces glaucescens, the polyketide
product is not found to be altered from a decaketide to
an octaketide, so the exact role of the CLF remains
unclear. An alternative nomenclature has been proposed in
which KS is designated KSα and CLF is designated KSβ, to
reflect this lack of knowledge (Meurer, G. et al.
Chemistry & Biology (1997) 4:433-443). The mechanism by
which acetate starter units and acetate extender units
are loaded onto the Type II PKS is not known, but it is
speculated that the malonyl-CoA:ACP acyltransferase of
the fatty acid synthase of the host cell can fulfil the
same function for the Type II PKS (Revill, W.P. et al. J.
Bacteriol. (1995) 177:3946-3952).

WO 95/08548 describes the replacement of
actinorhodin PKS genes by heterologous DNA from other
Type II PKS gene clusters, to obtain hybrid polyketides.
It also describes the construction of a strain of
Streptomyces coelicolor which substantially lacks the
native gene cluster for actinorhodin, and the use in that
strain of a plasmid vector pRM5 derived from the low-copy
number vector SCP2* isolated from Streptomyces coelicolor
(Bibb, M.J. and Hopwood, D.A. J. Gen. Microbiol. (1981)
126:427-442) and in which heterologous PKS-encoding DNA
may be expressed under the control of the divergent
actI/actIII promoter region of the actinorhodin gene
cluster (Fernandez-Moreno, M.A. et al. J. Biol. Chem.
(1992) 267:19278-19290). The plasmid pRM5 also contains
DNA from the actinorhodin biosynthetic gene cluster
encoding the gene for a specific activator protein,
ActII-orf4. The ActII-orf4 protein is required for
transcription of the genes placed under the control of
the actI/actIII bidirectional promoter and activates gene
expression during the transition from growth to
stationary phase in the vegetative mycelium (Hallam, S.E.
et al. Gene (1988) 74:305-320).

Type II clusters in Streptomyces are known to be
activated by pathway-specific activator genes (Narva,
K.E. and Feitelson, J.S. J. Bacteriol. (1990)
172:326-333; Stutzman-Engwall, K.J. et al. J. Bacteriol.
(1992) 174:144-154; Fernandez-Moreno, M.A. et al. Cell
(1991) 66:769-780; Takano, E. et al. Mol. Microbiol.
(1992) 6:2797-2804; Gramajo, H.C. et al. Mol. Microbiol.
(1993) 7:837-845). The DnrI gene product complements a
mutation in the actII-orf4 gene of S. coelicolor,
implying that DnrI and ActII-orf4 proteins act on similar
targets. A gene (srmR) has been described (EP 0 524 832
A2) that is located near the Type I PKS gene cluster for
the macrolide polyketide spiramycin. This gene
specifically activates the production of the macrolide
antibiotic spiramycin, but no other examples have been
found of such a gene. Also, no homologues of the
ActII-orf4/DnrI/RedD family of activators have been
described that act on Type I PKS genes. WO 98/01546
describes the use of the ActII-orf4 family of activators
in conjunction with their cognate promoters (e.g.
actII-orf4 with the actI promoter) in a heterologous
actinomycete to obtain high level expression of
recombinant Type I polyketide synthase genes.

Although large numbers of therapeutically important 
polyketides have been identified, there remains a need to 
obtain novel polyketides that have enhanced properties or 
possess completely novel bioactivity. The complex
polyketides produced by Type I PKSs are particularly
valuable, in that they include compounds with known 
utility as anthelminthics, insecticides, 
immunosuppressants, antifungal agents or antibacterial 
agents. Because of their structural complexity, such 
novel polyketides are not readily obtainable by total 
chemical synthesis, nor by chemical modifications of 
known polyketides. 

There is also a need to develop reliable and 
specific ways of deploying individual genes and portions 
of genes in practice so that all, or a large fraction, of 
hybrid PKS genes that are constructed, are viable and 
produce the desired polyketide product. This includes the 
development of advantageous host strains for expression 
of such genes. For example many polyketides are rendered 
bioactive by the action of further enzymes other than the 
polyketide synthase, and host strains that contain and 
are able to express the genes for such enzymes are 
particularly convenient for the efficient synthesis of 
the bioactive material. In those cases where the 
construction of a known or a novel polyketide requires
specialised precursors, host strains containing and able 
to express the genes for key enzymes that enhance the 
production of such specialised precursors are equally 
valuable and desirable. There is also a need to develop
rational methods of increasing the expression level of
all the genes required for production of a specific
polyketide. Clearly also a host cell which is
advantageous for the above reasons, and/or because of
other favourable characteristics including but not
limited to its speed of growth, excellent handling
characteristics in fermentation, and ease of
transformation with DNA by various techniques, can be
made even more favourable by the cloning into that cell
of such auxiliary genes for polyketide modification, or
gene activation, or post-translational modification, or
precursor supply.



The DNA sequences have been disclosed for several
Type I PKS gene clusters that govern the production of
16-membered macrolide polyketides, including the tylosin
PKS from Streptomyces fradiae (application EP 0 791 655
A2), the niddamycin PKS from Streptomyces caelestis
(Kakavas, S.J. et al. J. Bacteriol. (1997) 179:7515-7522)
and the spiramycin PKS from Streptomyces ambofaciens
(application EP 0 791 655 A2). DNA sequences have also
been disclosed for Type I PKS gene clusters that govern
the production of further complex polyketides, for
example rifamycin from Amycolatopsis mediterranei (WO
98/07868), and soraphen from Sorangium cellulosum (US
5716849), but so far no DNA sequence has been disclosed
for one of the most widespread and important classes of
complex polyketides, the polyethers.

Polyethers form an important group of complex
polyketide antibiotics (Westley, J.W. in "Antibiotics IV.
Biosynthesis" (Corcoran, J.W., Ed.), Springer-Verlag, New
York (1981) p. 41-73). They are polyoxygenated carboxylic
acids which act as selective ionophores transporting
cations across the cell membrane of target cells and
thereby causing depolarisation and cell death. Certain
polyethers including monensin, lasalocid and tetronasin
are in widespread use in animal husbandry as
coccidiostats (principally targeted against Eimeria
spp.) and as growth promoters. Polyethers have also been
reported to be active in vitro and in vivo against the
malarial parasite Plasmodium falciparum (Gumila, C. et
al. Antimicrobial Agents and Chemotherapy (1997)
41:523-529).

Polyethers contain multiple asymmetric centres and
are characterised by the presence of tetrahydrofuran and
tetrahydropyran rings, producing a characteristic shape
which is non-polar on its outer surface and therefore
well adapted for transport of material across bacterial
membranes; and provides on its inner surface polar
coordinating ligands for a centrally-bound metal ion. In
addition to tetrahydrofuran and tetrahydropyran rings,
other groups which are often present include spiroketal,
dispiroketal, and substituted benzoic acid moieties and
occasionally other groups, for example a tetronic acid or
a 6-membered carbocyclic ring.

Monensins A and B are produced by the actinomycete
Streptomyces cinnamonensis. Their structures are shown in
Figure 1. Monensin B differs from monensin A only in the
presence of a methyl sidechain at C-16 rather than an
ethyl sidechain. Monensin selectively binds and
transports sodium ions. In addition to its antibacterial
and antifungal properties monensin has some activity
against protozoal parasites such as the malarial parasite
Plasmodium falciparum. Although the structures of
polyethers differ significantly from those of other
complex polyketides such as the polyhydroxylated and
polyene macrolides, their biosynthesis appears to take
place by a metabolic pathway which has many common
elements. Thus experiments using carbon-14-labelled
precursors have shown that monensin A is synthesised from
five acetate, one butyrate and seven propionate units
(Day, L.E. et al. Antimicrob. Agents Chemother. (1973)
4:410-414). Similarly experiments using precursors
doubly-labelled with carbon-13 and oxygen-18 have shown
that oxygens O(1), O(3), O(4), O(5), O(6) and O(10) of
monensin arise from the carboxylate oxygens of either
propionate or acetate, while growth in the presence of
oxygen-18 oxygen gas demonstrated that the three
remaining ether oxygens O(7), O(8) and O(9) are derived
from molecular oxygen (Cane, D.E. et al. J. Am. Chem.
Soc. (1981) 103:5962-5965; Cane, D.E. et al. J. Am. Chem.
Soc. (1982) 104:7274-7281; Ajaz, A.A. and Robinson,
J.A. J. Chem. Soc. Chem. Commun. (1983) 12:679-680).
These findings have been rationalised by proposing that
the biosynthesis of monensin proceeds via an acyclic
triene intermediate (1) in which the geometry of all
three carbon-carbon double bonds is E (entgegen) rather
than Z (zusammen). The triene is then proposed to be
subject to epoxidation to a tri-epoxide (2) and then ring
opening is proposed to occur with concomitant sequential
formation of the five ether rings as shown in Figure 2A.
Such a biosynthetic pathway, first mooted by Westley in
1974 (Westley, J.W. et al. J. Antibiot. (1974)
27:597-604) accounts for the observed stereochemistry at
the multiple asymmetric centres in monensin (Cane, D.E.
et al. J. Am. Chem. Soc. (1982) 104:7274-7281; Sood, G.R.
et al. J. Chem. Soc. Chem. Commun. (1984) 21:1421-1424)
and analogous schemes can be used to account for the
biosynthesis of other known polyethers, such as lasalocid
A (Hutchinson, C.R. et al. J. Am. Chem. Soc. (1981)
103:5953-5956), tetronasin (ICI 139603) (Demetriadou,
A.K. et al. J. Chem. Soc. Chem. Commun. (1985) 7:408-410)
and narasin (Spavold, Z. et al. Tetrahedron Letters
(1986) 27:3299-3302). The hydroxylation at C-26 and the
introduction of an O-methyl group on oxygen 3 are
proposed to occur as late steps in the biosynthesis,
after formation of the polyether structure.
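
The precursor counts reported above can be checked with simple
arithmetic. The following sketch, offered by way of illustration
only, totals the building blocks of monensin A, derives the number
of condensations needed to join them (one unit acting as starter),
and counts the carbon atoms they contribute. The carbons-per-unit
values (acetate 2, propionate 3, butyrate 4) and the comparison with
a 36-carbon molecular formula for monensin A are general chemical
knowledge assumed for this example rather than statements taken from
the cited labelling studies; the function names are hypothetical.

```python
# Illustrative sketch only: bookkeeping for the building blocks of
# monensin A. Each entry is (number of units, carbons per unit).
# Carbons-per-unit values are assumed from general chemistry.
BUILDING_BLOCKS = {
    "acetate": (5, 2),
    "butyrate": (1, 4),
    "propionate": (7, 3),
}

# Assumption: the late-stage O-methylation adds one further carbon
O_METHYL_CARBONS = 1


def count_units(blocks):
    """Total number of starter plus extender units."""
    return sum(number for number, _ in blocks.values())


def count_condensations(blocks):
    """One unit is the starter; each remaining unit needs one condensation."""
    return count_units(blocks) - 1


def count_backbone_carbons(blocks):
    """Carbons contributed by the building blocks (chain plus branches)."""
    return sum(number * carbons for number, carbons in blocks.values())


if __name__ == "__main__":
    print("units:", count_units(BUILDING_BLOCKS))
    print("condensations:", count_condensations(BUILDING_BLOCKS))
    backbone = count_backbone_carbons(BUILDING_BLOCKS)
    print("carbons from building blocks:", backbone)
    print("carbons including O-methyl:", backbone + O_METHYL_CARBONS)
```

Thirteen units joined by twelve condensations contribute 35 carbon
atoms, and the O-methyl group brings the total to 36, which is
consistent with the accepted molecular formula of monensin A.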

Unfortunately key aspects of the biosynthetic scheme
shown in Figure 2A have so far eluded experimental
confirmation. No biosynthetic intermediates have been
isolated from mutants of S. cinnamonensis that are
blocked in early stages of monensin production.
26-Deoxymonensin A has been isolated from a S.
cinnamonensis mutant partially blocked in monensin
production (Ashworth, D.M. et al. J. Antibiot. (1989)
42:1088-1099) and 3-O-demethylmonensins A and B have been
recovered as minor components from the fermentation broth
of a monensin-producing strain (Pospisil, S. et al. J.
Antibiot. (1987) 40:555-557). When fed to cells of S.
cinnamonensis in radio-labelled form, neither
26-deoxymonensin A, nor 3-O-demethylmonensin A, nor
3-O-demethyl-26-deoxymonensin A was significantly
incorporated into monensin A (Ashworth, D.M. et al. J.
Antibiot. (1989) 42:1088-1099), either because they are
actively excluded or because these modifications in fact
occur earlier in the biosynthetic pathway so that these
metabolites are shunt products not readily converted into
the final antibiotic by the respective hydroxylase or
methyltransferase. Similarly, the putative all-(E)-triene
precursor (1) has been synthesised and shown not to
become incorporated into monensin when fed to growing
cells of S. cinnamonensis (Holmes, D.S. et al. Helv.
Chim. Acta (1990) 73:239-259). An alternative pathway has
been proposed, as shown in Fig 2B, based on the
transition-metal-mediated oxidation of 1,5-dienes (Walba,
D.M. and Edwards, P.D. Tetrahedron Lett. (1980)
21:3531-3534). The triene intermediate (4) would differ
from that of Figure 2A (1) only in that each carbon-carbon
double bond would have the (Z)-configuration (Townsend,
C.A. and Basak, A. Tetrahedron (1991) 47:2591-2602) and
not the (E)-configuration.

The genetic basis of secondary metabolite
biosynthesis essentially exists in the genes which code
for the individual biosynthetic enzymes and in the
regulatory elements which control the expression of the
biosynthetic genes. The genes encoding biosynthesis of
polyketides in actinomycetes have hitherto been found as
clusters of adjacent genes, ranging in size from
20 kilobasepairs (kbp) to over 100 kbp. The clusters
often contain specific regulatory genes and genes
conferring resistance of the producing strain to its own
antibiotic.

In various of its aspects the invention provides the
following:

(1) a DNA sequence encoding at least one peptide
necessary for the biosynthesis of monensin, preferably
comprising one or more of the following genes: mon BI,
mon BII, mon CI, mon CII, mon H, mon RI, mon RII, mon T,
mon AIX and mon AX as depicted in the appended sequence
data or an allele or mutation thereof;

(2) a DNA sequence according to the first aspect 
comprising all of the genes listed therein or an allele 
or mutation thereof; 

(3) a DNA sequence according to the first aspect
comprising the complete monensin gene cluster;

(4) a DNA sequence coding for one or more of the 
peptides set out below, said peptide having the amino 
acid sequence as set out in the appended sequence data or 
being a variant thereof having the specified activity: 

peptide      activity

mon CII      epoxyhydrolase/cyclase
mon E        S-adenosylmethionine-dependent methyltransferase
mon T        monensin resistance gene
mon RII      repressor protein
mon AIX      thioesterase
mon AI       polyketide synthase multienzyme
mon AII      polyketide synthase multienzyme
mon AIII     polyketide synthase multienzyme
mon AIV      polyketide synthase multienzyme
mon AVI      polyketide synthase multienzyme
mon AVII     polyketide synthase multienzyme
mon AVIII    polyketide synthase multienzyme
mon H        regulatory protein
mon CI       flavin-dependent epoxidase
mon BII      carbon-carbon double bond isomerase
mon BI       carbon-carbon double bond isomerase
mon D        cytochrome P450 hydroxylase
mon RI       activator protein
mon AX       thioesterase


(5) a recombinant cloning or expression vector 
comprising a DNA sequence according to any of aspects 1-4; 

(6) a transformant host cell which has been
transformed to contain a DNA sequence according to any of
aspects 1-4 and is capable of expressing a corresponding
peptide;

(7) a hybridization probe comprising a polynucleotide
which binds specifically to a region of the monensin gene
cluster selected from mon BI, mon BII, mon CI, mon CII,
mon H, mon RI, mon RII, mon T, mon AIX and mon AX;

(8) use of a probe according to aspect (7) in a
method of detecting the presence of a gene cluster which
governs the synthesis of a polyether, and optionally
isolating a gene cluster detected thereby;

(9) use of a probe comprising a polynucleotide which
binds specifically to a gene responsible for levels of
activity of the monensin gene cluster, preferably a
regulatory gene, resistance gene or thioesterase gene,
more preferably the regulatory gene mon RI, in a method of
detecting an analogous gene in a gene cluster of another
polyketide, preferably a polyether, and optionally
manipulating the gene detected thereby to alter the level
of expression of said other polyketide;

(10) a host cell, preferably Streptomyces
cinnamonensis, containing a heterologous gene under the
control of the mon RI gene and a monensin promoter;

(11) use of a portion of the monensin gene cluster
having chain terminating activity, preferably comprising
at least one of mon AIX and mon AX or a mutant or allele
thereof having chain terminating activity, to effect chain
release of a peptide other than one required for monensin
biosynthesis;

(12) use of a portion of the monensin gene cluster
having carbon-carbon double bond isomerase activity,
preferably comprising at least one of mon BI and mon BII
or a mutant or allele thereof having isomerase activity, to
provide a desired stereochemical outcome in the synthesis
of a polyketide other than monensin;

(13) a polypeptide encoded by a portion of the
monensin gene cluster, preferably comprising at least one
of mon BI and mon BII or a mutant or allele thereof,
having carbon-carbon double bond isomerase activity;

(14) an epoxidase enzyme encoded by mon CI or a 
derivative or variant thereof having epoxidase activity; 

(15) a cyclase enzyme encoded by mon CII or a
derivative or variant thereof having cyclase activity.

Some embodiments of the invention will now be
described by way of example with reference to the
accompanying drawings in which:
Fig 1 shows the structure of monensins A and B;
Fig 2 illustrates proposed biosynthetic pathways;
Fig 3 illustrates the proposed organization of the
monensin polyketide synthase (PKS) enzyme complex; and
Fig 4 illustrates the proposed organization of the
monensin biosynthetic gene cluster.

The overall gene organization of the monensin
biosynthetic gene cluster, as shown in Fig 4, is similar
to that previously found for many macrolide biosynthetic
gene clusters, which have one or more open reading frames
(ORFs) encoding large multifunctional PKSs flanked by
other genes which encode functions required for the
biosynthesis of the antibiotic. In the case of monensin,
there is an unusually high number of distinct ORFs
encoding PKS multi-enzymes (eight in total, labelled monAI
to monAVIII) but there is again a separate module of
enzymes for each cycle of polyketide chain extension,
exactly as found for modular PKSs for macrolide
biosynthesis (see Fig 3). Thus there are 12 condensations
predicted to be required for the production of the carbon
skeleton of monensin, and in agreement with this there are
found to be 12 extension modules of PKS enzymes
distributed among the 8 PKS ORFs. However, as mentioned in
detail below, the other genes in the monensin cluster
include genes which have not previously been found in any
other gene cluster for the biosynthesis of a complex
polyketide, and which are not significantly similar to any
genes in published sequence databases. The cloned DNA for
these genes is useful to allow the diagnosis that a
polyketide biosynthetic gene cluster in any actinomycete,
uncovered previously by conventional hybridization against
a PKS gene probe from (say) the DEBS or some other
characterised PKS gene cluster, is one that governs the
synthesis of a polyether; and these genes are also
valuable either singly or in combination as specific
hybridization probes for the specific detection and
isolation of additional polyether biosynthetic gene
clusters. Examples of these previously-unknown genes are
the genes monBI, monBII, monCI and monCII. In addition the
regulatory genes monH, monRI and monRII, the resistance
gene monT and the thioesterase genes monAIX and monAX are
all useful for the detection of analogous genes in other
polyether clusters which are required for the rational
manipulation of such genes in order to increase levels of
the specific product.

10 The cloned and sequenced cluster of genes for 

monensin biosynthesis is useful secondly in the 
engineering of mutant strains of 5. cinnamonensis and of 
other actinomycetes which are suitable strains for the 
high level production of either natural or novel 

15 recombinant polyketides. The sequence of the monensin 

cluster disclosed here shows the surprising fact, that the 
gene cluster contains a gene monRI whose gene product has 
an amino acid sequence highly similar to that of actll- 
orf4, the pathway-specific activator gene which activates 

20 the acti and other promoters of the actinorhodin 

biosynthetic gene cluster of Streptomyces coelicolor. The 
recognition of this aspect of the natural regulation of a 
Type I PKS cluster is important and valuable because 
first, it is possible to increase the yield of monensin by 

increasing the level of the activator MonRI, either by
placing the gene monRI under the control of a powerful
promoter or arranging for the presence within the cells of
one or more additional copies of the monRI gene (as
exemplified below); secondly, it will be possible to use
the monRI gene as a specific hybridisation probe to locate
similar genes in other complex PKS gene clusters,
especially other polyether PKS gene clusters but also
polyene and macrolide gene clusters and all other Type I
modular PKS gene clusters, even in cases where (as for
rapamycin and erythromycin) no such gene has been
previously found within the currently accepted physical
limits of the relevant biosynthetic gene cluster. In such
cases the monRI gene probe might be expected to uncover
the activator even if it resides on the chromosome at some
distance from the main body of the gene cluster; and
simple experiments would then show whether the
activator(s) so uncovered are involved in regulation of
the biosynthesis of those particular metabolites; thirdly,
increasing the copy number of the monRI gene or of any of
the activator genes uncovered will tend to increase the
yield of a heterologous polyketide by "crosstalk" where
the activator mimics the presence of the normal activator
for the transcription of the genes for that heterologous
polyketide synthase. It is clear from recently published
work (Wietzorrek, A. and Bibb, M. Mol. Microbiol. (1997)
25:1181-1184) that activators of the ActII-orf4 family
exert their effects by binding to promoter regions within
the target gene cluster, so it will be possible to use the
monRI gene together with monensin promoter regions to
drive the high-level transcription and translation of
heterologous genes in Streptomyces cinnamonensis, and
perhaps in other host strains too; such genes need not be
PKS genes or even involved in polyketide biosynthesis.
Monensin promoter regions are found at the 5' end of genes
or groups of genes in the cluster and their location is
clear from the sequence analysis disclosed here. Thus a
useful vector would provide the monensin promoter and the
ribosome binding site and continue up to the start of the
open reading frame, after which the monensin ORF naturally
found there would be replaced by the heterologous gene.
The relative strength of the monensin promoters can be
readily determined using any one of a number of known
promoter probes, i.e. genes whose expression gives rise to
readily measurable and quantifiable effects, such as Green
Fluorescent Protein (GFP), or beta-galactosidase in the
presence of a chromogenic substrate. It should be possible
to mutate randomly the small region of the monensin
promoters especially likely to interact with the MonRI
activator (identified by the presence of tandem
heptanucleotide repeats with a common consensus sequence
between the various monensin promoters) (Wietzorrek, A.
and Bibb, M. Mol. Microbiol. (1997) 25:1181-1184), and to
determine the optimal DNA sequence for the maximal
activation effect using either S. cinnamonensis
(preferably, in case there are other unknown factors that
make the activation function better in this strain than in
other heterologous systems), or even another host
actinomycete strain. If the natural monensin promoters
were mutated to have this optimal recognition sequence,
then this would further increase the production of
monensin. By extension, the use of this modified monensin
promoter in conjunction with the monRI gene in
heterologous systems could form the basis of further
improvements in expression of polyketide synthases or
other genes, either by appropriate chromosomal alterations
to introduce the altered promoter and also the monRI gene,
or by provision of vectors containing these optimised
signals linked to specific genes and housed in suitable
host cells.
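
By way of illustration only, the search for tandem heptanucleotide
repeats in a promoter region could be sketched as below. This is a
minimal sketch under stated assumptions: the function name
tandem_heptamers, the max_gap parameter and the toy sequence are all
hypothetical, the heptamer used is an arbitrary placeholder and not a
monensin promoter sequence, and only exact direct repeats are
reported (a practical search would probably tolerate mismatches
against a consensus).

```python
from typing import List, Tuple


def tandem_heptamers(seq: str, max_gap: int = 15, k: int = 7) -> List[Tuple[str, int, int]]:
    """Return (kmer, first_pos, second_pos) for every k-mer that recurs
    downstream as an exact, non-overlapping direct repeat with at most
    `max_gap` bases between the two copies. Positions are 0-based."""
    seq = seq.upper()
    hits = []
    last_start = len(seq) - k  # last index at which a full k-mer fits
    for i in range(last_start + 1):
        kmer = seq[i:i + k]
        # Second copy may start anywhere from directly adjacent (gap 0)
        # up to max_gap bases after the end of the first copy.
        for j in range(i + k, min(i + k + max_gap, last_start) + 1):
            if seq[j:j + k] == kmer:
                hits.append((kmer, i, j))
    return hits


# Hypothetical toy sequence: two copies of a placeholder heptamer
# separated by a 4-base spacer, with 2-base flanks on each side.
TOY_PROMOTER = "AA" + "TCGAGCG" + "TTTT" + "TCGAGCG" + "AA"

for kmer, first, second in tandem_heptamers(TOY_PROMOTER):
    print(kmer, first, second)
```

Candidate repeats found in this way would still need to be compared
across the various promoters to see whether they share a consensus.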

The sequencing of the monensin cluster has uncovered
another strategy for gene regulation in such Type I
clusters. The previously-sequenced genes for the rapamycin
biosynthetic pathway in Streptomyces hygroscopicus
included a gene of unknown function (rapH). A closely
similar gene has now been found in the monensin
biosynthetic gene cluster (monH), and it is clear from
this recurrence (and the comparison of the sequences with
those of database proteins) that this gene is potentially
an important DNA-binding sensor gene which acts to
regulate the transcription of the cluster in concert with
other regulatory signals. Simple experimentation is needed
in order to define whether the gene is an activator, in
which case putting in another copy or increasing its
transcription will have the potential to increase
polyketide biosynthesis; or alternatively the rapH gene
product may be a negative regulator, whereupon deletion of
this gene may release the biosynthetic pathway from this
inhibitory effect and increase yields.

There is a continuing need to develop new methods of
high-level production of bioactive metabolites and other
valuable gene products in actinomycetes. Streptomyces
cinnamonensis is a recognised and very valuable industrial
strain for the production of very high levels of monensin;
it is readily transformable with DNA by standard methods
of conjugation or of protoplast transformation; it is a
host for numerous known broad host range plasmids
including well-known expression plasmids of both high- and
low-copy number; it also grows quickly relative to other
actinomycete strains (for example about three times faster
than wild type Saccharopolyspora erythraea, the
erythromycin producer, under comparable conditions) and
sporulates relatively easily. Heterologous polyketides can
be expressed in Streptomyces cinnamonensis using for
example the low-copy number plasmid pCJR24 (which has no
origin of replication active in actinomycetes, so is
maintained by integration into the chromosome) (Rowe, C.
et al. Gene (1998) 216:215-223) or the related plasmid
pCJR29, in which the polyketide synthase gene(s) are placed
under the control of the actI promoter which is activated
by the ActII-orf4 activator; or alternatively the monAI
promoter can be substituted together with the MonRI
activator; or some other pairing of activator and cognate
promoter chosen from either a Type II or a Type I
polyketide synthase gene cluster. As an example, the wild
type strain of Streptomyces cinnamonensis has been used to
express the plasmid pCJR29 (Rowe, C. et al. Gene (1998)
216:215-223) containing as insert the three ORFs for the
PKS governing the production of 6-deoxyerythronolide B,
the macrolide precursor of erythromycin A in
Saccharopolyspora erythraea, these genes being placed
under the control of the pathway-specific actI promoter
from Streptomyces coelicolor together with its cognate
activator gene actII-orf4. The transformed strain when
cultivated in a suitable liquid medium produced
6-deoxyerythronolide B in good yield.

It is well known to the person skilled in the art
that it is possible to use standard vectors unable to
replicate in actinomycetes to introduce DNA into a
Streptomyces cell, such DNA comprising two portions of
contiguous DNA which are each identical to one of two
portions of the cell's chromosome that are spaced up to
100 kbp apart; and that through recombination between the
incoming DNA and the chromosome occurring in both portions
of DNA the net result is that the chromosomal sequence is
replaced by the defective sequence originally carried by
the incoming DNA. Such a procedure has been applied to the
monensin-producing strain of S. cinnamonensis as described
in detail below, and a strain of S. cinnamonensis has been
obtained that carries a specific deletion in the monensin
cluster and which is unable to produce the antibiotic. The
use of such a strain facilitates the production of
heterologous polyketides by removal of the background of
monensin production.

The multiple uses of portions of the cloned and
sequenced DNA from the monensin cluster will readily occur
to the person skilled in the art. A surprising feature of
the PKS of the monensin cluster is an unusual mechanism of
polyketide chain initiation. We have found that the
monensin PKS loading module has three domains, which from
the amino-terminus of the protein are: a KSq domain, an
acyltransferase domain and an ACP domain. We have
uncovered this organisation in the PKS for the 14-membered
macrolide oleandomycin as well as in the monensin PKS, an
organisation of the loading module previously only found
for the 16-membered macrolides and in which the KSq domain
(which looks like a ketosynthase or condensation domain
except that the active site cysteine residue is
substituted by a glutamine, for which the single letter
notation is Q) had been previously speculated to have no
function. It was realised that the acyltransferase of the
loading module actually has malonyl-CoA and not acetyl-CoA
as a substrate and that KSq is an active decarboxylase. It
appears that a better discrimination can be achieved in
the selection of the smaller acetate unit over propionate
if the choice is made initially between methylmalonyl- and
malonyl-CoA.

An unprecedented feature of the monensin PKS genes is
that no integral chain-terminating domain is present as a
C-terminal appendage of the PKS extension module that
catalyzes the twelfth and final chain extension. Because
the product of the monensin PKS is a carboxylic acid, it
would have been firmly predicted that chain release would
have been catalyzed by such a C-terminal domain containing
a "thioesterase" activity. Previously sequenced PKS gene
sets have been of two sorts: first, those macrolide PKSs
typified by erythromycin, spiramycin, tylosin, niddamycin
which have a readily recognisable C-terminal
"thioesterase" domain, which in these enzymes functions as
a specific cyclase rather than releasing the polyketide
product as a free carboxylic acid; secondly, those
macrolide PKSs typified by rapamycin, FK506, and
rifamycin, where there is an alternative and recognised
mode of chain termination by transfer of the polyketide
chain to an acceptor moiety, catalyzed by a specific
enzyme (e.g. pipecolate incorporating enzyme for rapamycin
(Schwecke T. et al. Proc. Natl. Acad. Sci. USA (1995)
92:7839-7843) and FK506 (Motamedi H. and Shafiee A. Eur.
J. Biochemistry (1998) 256:528-534); arylamine synthetase
for rifamycin (August P.R. et al. Chemistry & Biology
(1998) 5:69-79)).

The monensin PKS surprisingly falls into neither
category, and therefore seems to be the first example of a
novel mode of chain termination. It is novel and
noteworthy in this connection that the monensin PKS gene
cluster contains two small genes that encode discrete,
monofunctional thioesterase enzymes. Although many PKS
gene clusters have been previously shown to contain one
such discrete thioesterase, none have been shown to have
two. The role of such thioesterases is not known, although
in the case of the methymycin/pikromycin PKS, which has
been reported to be responsible for the biosynthesis of
both the 12-membered macrolide methymycin and the
14-membered macrolide pikromycin (Xue Y.Q. et al. Proc.
Natl. Acad. Sci. USA (1998) 95:12111-12116), the disruption
of this thioesterase reportedly caused a ten-fold drop in
the amount of both macrolides produced. A similar finding
has been reported for the discrete thioesterase of the
tylosin PKS gene cluster (Cundliffe E. et al. Chemistry &
Biology in press). Additional copies of such thioesterases
may therefore accelerate the production of a specific
polyketide, but this has not yet been demonstrated.
However, the presence of the discrete thioesterase is not
completely essential for polyketide production.

It is highly desirable to have a broadly effective
method of catalysing the release of polyketide gene
products from a PKS as the free acid. The well-studied
integral thioesterase domain in the erythromycin PKS
has a broad specificity in cyclization to
form a lactone (assuming that a hydroxy group is present
in the growing polyketide chain at an appropriate
position), but hydrolysis to form the free acid is very
slow. The recognition of the unusual arrangement of the
monensin PKS means that it is now possible to harness
either the entire PKS module that catalyses the twelfth
and final extension cycle in monensin biosynthesis, or the
C-terminal portion of it, and graft it onto a different
polyketide synthase by genetic engineering, so as to allow
the release mechanism characteristic of monensin to
operate in a different context. The use of this portion
only of the monensin PKS suffices to allow the novel
mechanism of chain release to operate successfully. The
speed of the polyketide chain hydrolysis in a given case
can depend on the additional presence of one or both of
the discrete thioesterase genes (monAIX and monAX) from
the monensin gene cluster. The use of this novel method of
chain termination represents a valuable way of generating
a large number of novel engineered polyketides that are
currently inaccessible, and ensuring that the products
have a specified chain length.

The genes monBI and monBII appear to encode very
similar enzymes with significant amino acid sequence
similarity to authentic ketosteroid isomerases which are
known to catalyse the migration of an activated
carbon-carbon double bond. The conservation of active site
residues makes it very likely that these mon genes govern
a reaction involving activated double bonds in the
biosynthetic pathway to monensin and this surprising
observation can be accommodated if the initial product of
the polyketide chain growth on the monensin PKS is a
linear precursor in which the double bonds were initially
formed with a conventional trans or E (entgegen) geometry;
but before the polyketide chain was extended by insertion
of the next unit the monBI and/or the monBII gene
product(s) catalyse the specific rearrangement of the
newly-created double bond into the cis or Z (zusammen)
geometry. This new view of the monensin biosynthetic
pathway allows the deduction that the monBI and monBII
genes, perhaps in combination with specific portions of
the monensin modules where they normally exert their
effects (namely modules 3, 5 and 7), might be used in order
to achieve the extremely desirable targeted biosynthesis
of novel polyketides containing double bonds with Z
geometry at specified point(s) along the chain. Thus for
example it should be possible to provide for the direct
biosynthesis of the C22-C23 cis or Z double bond in
avermectins, thus avoiding tedious and expensive chemical
conversion of an initial fermentation product into this
important anthelminthic. Only limited experimentation is
needed to see whether the monBI and/or monBII gene
products are sufficient or whether the mon PKS at modules
3, 5 and 7 forms part of the specific docking site(s) for
the isomerases and therefore must also be used in the
creation of the hybrid PKS that will insert the cis or Z
double bond at the desired position. The substrate
specificity of the isomerases need not be limited to
2,3-unsaturated thioesters. The purified enzymes could also
be used to effect such isomerisations in vitro, depending on
the position of the equilibrium or whether further enzymes
are used to achieve the further transformation of the
product as it is formed (vide infra).

The product of the monCI gene is a novel oxidative
enzyme with some sequence similarity to authentic examples
of such enzymes in the databases, and with a clearly
definable role in the monensin biosynthetic pathway: the
epoxidation of the double bonds at three separate
positions in the initially-formed acyclic intermediate in
monensin biosynthesis. This epoxidase could therefore be
used in conjunction with monBI/monBII gene products to
effect oxidative reactions on suitable substrates in vitro
and in vivo. Similarly the monCII gene product is a
putative cyclase that opens the epoxides and causes the
formation of ether rings in monensin.

Any or all of the monBI, monBII, monCI or monCII
genes may be introduced into a heterologous strain
containing the gene cluster for another polyether, in
order to divert the biosynthetic pathway and produce a
polyketide of altered structure. In these experiments the
analogues of these monB genes could either be present or
(once located and characterised using the mon genes as
probes) they may be deleted prior to the introduction of
the monB and monC genes into that strain. The converse
experiment in which analogues of the monB and monC genes
from other strains are introduced into S. cinnamonensis
likewise has the potential to produce novel oxidised
polyketides. Also, the monB and monC genes or their
analogues may be introduced into a strain that normally
produces a macrolide or a polyene or some other complex
polyketide and expressed there, when they may effect the
diversion of the growing polyketide chain on a
heterologous modular PKS towards a new product, which may
or may not have the structure of a polyether.

The availability of the monensin gene sequence allows
the institution of domain swaps to alter the
acyltransferase (AT) specificity of a given module. For
example the ethylmalonyl-CoA specific AT found in
one of the modules of the monensin PKS can be used to
replace one of the other ATs to generate an ethyl side
branch at that position in the chain, or the AT can be
used to substitute in any other (e.g. macrolide) PKS, as
described in WO 98/01571 and WO 98/01546. Similarly the
alteration of the level of reduction in a module, by
manipulation of the reductive enzymes, can be applied to
the monensin genes and here it will produce, depending on
which module is affected, either an altered monensin, or a
species which is only partly cyclised, or a polyether with
an altered pattern of cyclisation, or even a linear
polyketide.

In general the targeted alteration of the pattern of
substitution of sidechains or reduction level along the
polyketide chain produced by the monensin PKS will, like
the disruption or deletion of the oxidative enzymes
mentioned above, lead to non-polyether polyketide
products. It should be possible, by introduction of the
DEBS thioesterase at the C-terminus of one of the later
modules of the monensin PKS, together with an
appropriately placed hydroxy group earlier in the chain,
to produce novel macrolide products from this polyether
PKS system, or alternatively novel polyenes of defined
chain length and chosen ring size.

Example 1

Cloning of the monensin A biosynthetic gene cluster using
DNA probes derived from the erythromycin-producing
polyketide synthase of Saccharopolyspora erythraea
A genomic library of the monensin A producing strain
Streptomyces cinnamonensis ATCC 15413 was constructed
using methods well-known in the art, namely, the
production of high molecular weight genomic DNA, followed
by the partial cleavage of this DNA using the
frequent-cutting restriction enzyme Sau3A, fractionation of
the fragments on a sucrose gradient and selection of
fragments of average size 35-40 kbp, and the cloning of
these fragments into the cosmid vector pWE15 (Evans, G.A.
et al. Gene (1989) 79:9-20) which had been previously
digested with BamHI and treated with shrimp alkaline
phosphatase. The library was packaged and transfected into
Escherichia coli XL1-Blue MR cells. The library was plated
out on 2xTY agar medium (10 g tryptone, 10 g yeast extract,
5 g NaCl, 15 g bactoagar per litre, containing ampicillin
at 50 µg/ml) for cosmid selection and the colonies were
allowed to grow overnight. The library was then screened by
hybridisation using as a probe DNA encoding the
ketosynthase domain of module 1 of the
erythromycin-producing PKS (6-deoxyerythronolide B
synthase, DEBS) of Saccharopolyspora erythraea. The
colonies giving a positive hybridisation signal were
selected and the cosmid DNA from each colony was purified
and mapped by restriction digestion. The presence of the
target biosynthetic genes on a cosmid was verified by
sequencing of the ends of the cosmid inserts using the
commercially available T3 and T7 primers which hybridise
specifically to the respective ends of each cosmid insert
(Evans, G.A. et al. Gene (1989) 79:9-20).
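
As a rough guide to how many cosmid clones such a library would need
to contain, the standard Clarke-Carbon estimate N = ln(1 - P) /
ln(1 - f) may be applied, where f is the insert size divided by the
genome size and P is the desired probability that any given sequence
is represented. The sketch below is illustrative only: the genome
size of 8 Mbp is an assumption (it is not stated in this
specification), the insert size is taken as the midpoint of the
35-40 kbp range given above, and the function name clones_needed is
hypothetical.

```python
import math


def clones_needed(genome_bp: float, insert_bp: float, probability: float = 0.99) -> int:
    """Clarke-Carbon estimate of the number of independent clones
    needed so that a given sequence is present with `probability`."""
    if not 0 < probability < 1:
        raise ValueError("probability must be strictly between 0 and 1")
    fraction = insert_bp / genome_bp  # share of the genome in one insert
    if not 0 < fraction < 1:
        raise ValueError("insert must be smaller than the genome")
    return math.ceil(math.log(1 - probability) / math.log(1 - fraction))


ASSUMED_GENOME_BP = 8_000_000  # assumption: typical Streptomyces genome size
MEAN_INSERT_BP = 37_500        # midpoint of the 35-40 kbp fragments

n_clones = clones_needed(ASSUMED_GENOME_BP, MEAN_INSERT_BP, probability=0.99)
print(f"about {n_clones} clones for 99% representation")
```

Under these assumptions the estimate comes out at roughly a thousand
clones; a larger genome or a higher required probability increases
the number accordingly.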
Example 2

Sequencing of the biosynthetic gene cluster for monensin A
from Streptomyces cinnamonensis

Three cosmids obtained by screening of the genomic
library of S. cinnamonensis were used to obtain the entire
DNA sequence of the monensin biosynthetic gene cluster.
These cosmids, MO.CN02, MO.CN11 and MO.CN33, between them
contain the entire DNA sequence of the cluster and the
adjacent regions of the chromosome. They have been
deposited in NCIMB, 23 St Machar Drive, Aberdeen AB24
3RY, UK, under the NCIMB accession numbers 40956
(MO.CN11); 40957 (MO.CN33) and 40958 (MO.CN02)
respectively.

The DNA of each cosmid was separately subjected to
partial digestion with Sau3A and fragments of
approximately 1.5-2.0 kbp were separated by agarose gel
electrophoresis. The fragments were then ligated into the
plasmid vector pUC18 (Messing, 1982), previously digested
with BamHI and treated with shrimp alkaline phosphatase.
The library was transformed into E. coli strain XL1-Blue
MR and plated on 2xTY agar medium containing ampicillin
(100 µg/ml) to select for plasmid-containing cells.
Plasmid DNA was purified from individual colonies and
sequenced using the Sanger dye-terminator procedure on an
ABI 377 automated sequencer (Sanger, F. Science (1981)
214:1205-1210). The sequence data obtained from single
random subclones of a cosmid were assembled into a single
continuous sequence and edited using the GAP4.1 program of
the STADEN gene analysis package (Staden, R. Molecular
Biotechnology (1996) 5:233-241).

The sequence is set out in the appended sequence
listing.

Tables I and II contain data about individual genes
and gene products.
Example 3

Inactivation of the monensin A biosynthetic gene cluster

A chromosomal gene disruption experiment was used to
verify the identity of the cloned polyketide synthase gene
cluster. Plasmid pMOB6314 is a pUC18 sequencing subclone
of the presumed monensin A biosynthetic gene cluster
prepared as described in Example 1, whose inserted DNA
comprises the DNA sequence from nucleotide 9763 to
nucleotide 10108 in SEQ ID 1, and which therefore contains
a region of DNA wholly internal to orfE, a putative
3-O-methyltransferase. A HindIII fragment containing the
thiostrepton resistance gene tsr from plasmid pIJ702
(Katz, E. et al. J. Gen. Microbiol. (1983) 129:2703-2714)
was cloned into the HindIII site of plasmid pMOB6314 and
the ligation mixture was used to transform E. coli cells.
Transformants bearing the required plasmid pMOAE01 were
identified by isolation of plasmid DNA and analysis by
restriction digestion. Plasmid pMOAE01 was used
to transform protoplasts of Streptomyces cinnamonensis as
described by Hopwood D.A. et al. (1985). Since plasmid
pMOAE01 lacks an origin of replication that is active in
Streptomyces, growth in the presence of thiostrepton (25
µg/ml) in the regeneration medium led to the isolation of
stable integrants. Isolated putative integrants were
tested for the presence of integrated pMOAE01 sequences by
Southern hybridisation. A clone of Streptomyces
cinnamonensis identified by its restriction pattern in
Southern hybridisation as bearing pMOAE01 integrated in
the region of monE of the monensin A biosynthetic gene
cluster was designated S. cinnamonensis MO-DD01.

Detection of the monensin A related metabolites produced
by S. cinnamonensis MO-DD01 was performed by GC-MS
analysis of methanol extracts of the entire broth
harvested after 72 hours of growth of the strain. No
significant amounts of monensin A-related metabolites
were detectable.
Example 4

Overproduction of erythromycin aglycone in Streptomyces
cinnamonensis

S. cinnamonensis is a suitable system for
overproduction not just of monensin A but also of other
polyketide metabolites. Established techniques of genetic
transformation allow fast introduction of foreign
polyketide-producing gene sets into this host. Fast
growth of S. cinnamonensis in liquid culture and optimal
precursor supply favour high yield of polyketide
metabolites.

Construction of pIB061

S. erythraea NRRL2338 was transformed with pCJR30
(Rowe, C.J. et al. (1998) Gene 216:215-223) using a
routine protoplast transformation technique as described
by Hopwood et al. (1985). A stable integrant, S.
erythraea [pCJR30], was identified and the production of
10 mg/L of the triketide lactone (delta lactone of
(2S,3R,4R,5R)-2,4-dimethyl-3,5-dihydroxy-heptanoic acid)
in addition to erythromycins was confirmed by MS
analysis.

Total DNA of S. erythraea [pCJR30] was purified and
approximately 200 ng was digested with EcoRI endonuclease.
The digestion mixture was precipitated with isopropanol
and the resulting DNA was treated with T4 DNA ligase for
16 hours at 16°C. The ligation mixture was used to
transform E. coli DH10B cells. The transformants were
screened for the presence of the plasmid. A clone
containing a 44.7 kb plasmid was identified and confirmed
by restriction analysis to contain three complete genes:
eryAI, eryAII and eryAIII. The plasmid was named pIB061.

Transformation of S. cinnamonensis

Protoplasts of S. cinnamonensis were prepared by a
modified procedure of Hopwood et al. (1985). Plasmid
pIB061 was transformed into the protoplasts of S.
cinnamonensis and stable thiostrepton resistant colonies
were isolated. Individual colonies were checked for their
plasmid content and the presence of plasmid pIB061 was
confirmed by its restriction pattern. S. cinnamonensis
(pIB061) was inoculated into 250 ml of M-C3 minimal
production medium containing 10 µg/ml of thiostrepton and
allowed to grow for 72 hours at 30°C. After this time the
mycelia were removed by filtering. The broth was extracted
with two volumes of ethyl acetate and the combined ethyl
acetate extracts were washed with an equal volume of
saturated sodium chloride, dried over anhydrous sodium
sulphate, and the ethyl acetate was removed under reduced
pressure to give about 200 mg of crude product. The
product was analysed by LCQ and its mass was confirmed to
correspond to that of erythronolide B.

This example demonstrates the importance of S.
cinnamonensis for production of high levels of foreign
polyketide antibiotics. Introduction of the complete
erythromycin gene cluster or other gene clusters into this
system is likely to produce high levels of the
corresponding metabolites.

Example 5

Construction of plasmid pCJW58 containing the monensin
activator gene under the ermE* promoter

The ermE* promoter derived from the ermE resistance
methyltransferase gene of S. erythraea (Bibb et al. Gene
(1985) 38:215-226) was amplified by PCR as a SpeI-XbaI
fragment using the following oligonucleotides
5'-CCACTAGTATGCATGCGAGTGTCCGTTCGAGT-3' and
5'-TTGTATACACCTAGGATGGTTGGCCGTGC-3' with pRH3 (Dhillon et
al. Molecular Microbiology (1989) 3:1405-1414) as a
template and cloned into SmaI-digested,
phosphatase-treated pUC18, to produce plasmid pIB135. The
integrative plasmid pSET152 (Bierman, M. et al. (1992)
Gene 116:43-49) was digested with XbaI and the backbone
was dephosphorylated and ligated to the SpeI-XbaI fragment
of pIB135 containing the ermE* promoter. The ligation
mixture was used to transform E. coli DH10B and the
orientation of the insert in the plasmids from individual
clones was checked by restriction analysis. A plasmid with
the ermE* promoter oriented so that the NdeI and XbaI sites
are adjacent to the apramycin resistance gene was selected
and named pIB139.
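
When designing or checking primers of this kind it can be convenient
to confirm by machine that the intended restriction site is actually
present in the oligonucleotide. The following is a minimal sketch
under stated assumptions: the function name find_sites and the small
site table are hypothetical helpers written for this illustration,
and only three enzymes are listed (SpeI ACTAGT, XbaI TCTAGA, NdeI
CATATG). All three recognition sequences are palindromic, so scanning
one strand is sufficient.

```python
from typing import Dict, List

# Recognition sequences (5'->3') of the enzymes considered here.
SITES: Dict[str, str] = {
    "SpeI": "ACTAGT",
    "XbaI": "TCTAGA",
    "NdeI": "CATATG",
}


def find_sites(seq: str, sites: Dict[str, str] = SITES) -> Dict[str, List[int]]:
    """Return {enzyme: [0-based start positions]} for each enzyme whose
    recognition sequence occurs in `seq`. Whitespace is ignored, so
    primers written in spaced triplets can be passed in directly."""
    cleaned = "".join(seq.split()).upper()
    found: Dict[str, List[int]] = {}
    for enzyme, site in sites.items():
        positions = []
        start = cleaned.find(site)
        while start != -1:
            positions.append(start)
            start = cleaned.find(site, start + 1)  # allow overlapping hits
        if positions:
            found[enzyme] = positions
    return found


# Forward ermE* primer as written in this example.
ERME_FORWARD = "CCACTAGTATGCATGCGAGTGTCCGTTCGAGT"
print(find_sites(ERME_FORWARD))
```

The same helper may be applied to any other oligonucleotide or to a
cloned insert before restriction analysis is carried out.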

The monR gene from the monensin biosynthetic gene
cluster was amplified, and NdeI and XbaI restriction sites
introduced at the 5' and 3' ends respectively, by PCR using
as primers the following oligonucleotides:

5'-AGA TAG CAT ATG CTG GGG GGG GTC GGG AT -3' 
and 5'-AAT GGT GTA GAG TGT GAG GGA GGG GAG AGG GGG AA-3' 
and cosmid MO.CN11 as template. The PCR product was
ligated into SmaI-treated and phosphatase-treated plasmid
pUC18 and the ligation mixture was used to transform E.
coli DH10B cells. Transformant colonies were analysed for
the presence of plasmid and the identity of the plasmid
inserts was verified by sequencing. A plasmid whose
insert contained the monR gene flanked by NdeI and XbaI
restriction sites was selected and designated pCJW57.

Plasmid pCJW57 was digested with NdeI and XbaI and
the fragment containing the monR gene was ligated together
with the backbone of plasmid pIB139 which had been
digested with the same two restriction enzymes, and
purified by gel elution. The ligation mixture was used to
transform E. coli strain DH10B cells. Transformant
colonies were analysed for the presence of plasmid and the
identity of the plasmid inserts was verified by
restriction analysis. One such recombinant was selected
and named plasmid pCJW58.

Plasmid pCJW58 was used to transform the
methylation-deficient E. coli strain ET12567 (MacNeil D.J.
et al. (1992) Gene 111:61-68) and the recovered,
unmethylated plasmid was then used to transform the same
E. coli strain ET12567 housing the plasmid pUB307, a
derivative of RP4 which is mob~ and which contains a gene
for kanamycin resistance (Piffaretti, J.C. et al. (1988)
Mol. Gen. Genet. 212:215-218). Recombinants were plated on
2xTY agar medium containing apramycin and kanamycin at
final concentrations of 50 micrograms per ml and 50
micrograms per ml respectively. The plasmid content of
recombinants was checked by isolation of plasmid DNA and
checking of the identity of these plasmids by restriction
analysis. One such clone which contained both pUB307 and
plasmid pCJW58 was selected and used for further
experiments.

Construction of Streptomyces^ cinnamonensis (pCJW58) 
and production of monensins 

A single colony of E. coli ET12567 housing both
pUB307 and pCJW58 was toothpicked into 3 ml of TY liquid
medium, containing apramycin and kanamycin at 25 and 25
micrograms per ml respectively, and grown overnight at
37°C. This culture was used to inoculate 25 ml of TY medium,
supplemented with the same antibiotics at the same
concentrations, and growth was continued until the
absorbance at 600 nm (1 cm pathlength) was between 0.3 and
0.6. The cells were centrifuged (room temperature, 7
minutes, 2000 x g), resuspended in TY liquid medium (10
ml) containing no added antibiotics, re-centrifuged as
before, then resuspended in 2 ml of TSB medium and placed
on ice. Meanwhile, 0.5 ml of TSB medium was added to 100
microlitres containing approximately 10^8 spores of S.
cinnamonensis. After a brief heat shock, at 50°C for 10
minutes, the suspension was briefly cooled, mixed with
0.5 ml of donor E. coli cells, and plated on solid A
medium, which has composition as follows:

A medium

Sigma wheat starch        5g
Corn steep powder         1.25g
Yeast extract             1.5g
CaCO3                     ITJg
FeSO4                     6 mg
DIFCO agar                10g
H2O                       to 500 ml

pH adjusted to pH 7 with KOH.
And to which in addition was added 10 luM MgCl2 to a
final concentration of 10 mM.

The plates were allowed to dry overnight at room
temperature, and were then allowed to incubate a further
18 hours at 30°C. After this time each 25 ml plate was
overlaid with a solution of apramycin (final concentration
50 micrograms per ml) and nalidixic acid (final
concentration 20 micrograms per ml), and the plates were
allowed to incubate for four days at 30°C. At this time
individual colonies were toothpicked onto solid A medium
and allowed to grow. Four representative colonies from
the A medium plate were grown up in liquid modified YEME
medium, which has composition as follows:

Modified YEME medium

Sucrose                   100g
DIFCO Yeast extract       3g
Bacto peptone             5g
Oxoid Malt extract        3g
Glucose                   10g
H2O                       to 1L

pH adjusted to pH 7.2 with NaOH.

These cultures were used to provide a 2% vol/vol
inoculum for 30 ml of modified YEME which was grown for 7
days, and then transferred to SM16 medium, which has
composition as follows:

SM16 medium

3-[N-Morpholino]-propane sulfonic acid
(MOPS) buffer                               20.9g
L-proline                                   10.0g
Glucose                                     20g
NaCl                                        0.5g
K2HPO4                                      2.1g
Ethylenediaminetetraacetic acid, sodium
salt                                        0.25g
MgSO4.7H2O                                  0.49g
CaCl2.2H2O                                  0.029g
Trace elements solution (Hopwood,
D. A. et al. (1985) Genetic Manipulation
of Streptomyces - a Laboratory Manual,
at p. 235)                                  2 ml
0.5 M CoCl2 solution                        2 microlitres
H2O                                         to 1L

pH adjusted to pH 7 with NaOH.

After growth for a further 7 days, mycelium was
collected by centrifugation at 2000 x g for 30 minutes,
and the supernatant was extracted three times with 300 ml
of ethyl acetate. The combined extracts were concentrated
by evaporation under reduced pressure to an oil, which was
mixed with 1 ml of methanol. Samples were applied to an
LCQ liquid chromatograph fitted with a mass spectrometer
detector unit. The column used was a C18 reversed phase
column, equilibrated with a mixture of 80% 20mM ammonium
acetate/20% acetonitrile, and the column was eluted with a
gradient of increasing acetonitrile, reaching 100%
acetonitrile over 24 minutes. Monensins A and B emerged
from the column with retention times of 8.2 minutes and
9.2 minutes respectively. The relative amounts of monensin
produced by three independent clones (A-C) containing an
additional copy of the monR gene were compared to a
control fermentation of the wild type S. cinnamonensis
strain, with the results shown in the Table below:
Table showing increased monensin production in strains
bearing additional copy of monR gene

Strain      monensin A            monensin B
            concentration         concentration
            (arbitrary units)     (arbitrary units)
Control     188                   861
A           430                   1800
B           450                   1300
C           249                   1300
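
The relative increases in the Table can be made explicit by dividing each clone's value by the control value. The following is a minimal sketch in Python, not part of the original disclosure: the dictionary layout and the function name fold_increase are illustrative assumptions, and the numbers are simply those of the Table (arbitrary units).

```python
# Monensin concentrations (arbitrary units) copied from the Table.
# The data layout and helper name are illustrative assumptions.
titres = {
    "Control": {"monensin_A": 188, "monensin_B": 861},
    "A": {"monensin_A": 430, "monensin_B": 1800},
    "B": {"monensin_A": 450, "monensin_B": 1300},
    "C": {"monensin_A": 249, "monensin_B": 1300},
}


def fold_increase(strain, table=titres, control="Control"):
    """Return (monensin A, monensin B) values relative to the control."""
    ref = table[control]
    return tuple(
        round(table[strain][key] / ref[key], 2)
        for key in ("monensin_A", "monensin_B")
    )


for clone in ("A", "B", "C"):
    fold_a, fold_b = fold_increase(clone)
    print(f"clone {clone}: monensin A x{fold_a}, monensin B x{fold_b}")
```

On these figures every clone carrying the additional copy of monR gave more monensin A and more monensin B than the control fermentation.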

Example 6

Construction of S. cinnamonensis M12-AT5
A region lying immediately 5' of the DNA encoding the
acyltransferase (AT12) domain of module 12 of the monensin
polyketide synthase in the monensin biosynthetic gene
cluster was amplified with the following primers:
5'-GGTGGCCACGGAAACACCAACACCGGACCCGCGCC-3' and
5'-CTCTCGGAGGCCCGGCGCAACGGCCACAA-3', using cosmid MO-CN11
as a template. The PCR product was ligated into
SmaI-digested and phosphatase-treated plasmid pUC18 and the
ligation mixture was used to transform E. coli DH10B
cells. Transformant colonies were analysed for the
presence of plasmid and the identity of the plasmid
inserts was verified by sequencing. A plasmid whose
insert contained a fragment upstream of the AT12-encoding
sequence from about 82.3kb to 83.2kb of the mon cluster
was designated pM081. Similarly a region lying immediately
3' of the DNA encoding the acyltransferase (AT12) domain
of module 12 of the monensin polyketide synthase in the
monensin biosynthetic gene cluster was amplified with the
following primers:
5'-GGCCTAGGGCTGCCTCGGGTGGTGGATCTGCCGA-3' and
5'-TGGTCGGGCGCGGTGCGTGCGATACGT-3', using cosmid
MO-CN11 as a template. The PCR product was ligated into
SmaI-treated and dephosphorylated pUC18 and the ligation
mixture was used to transform E. coli DH10B cells.
Transformant colonies were analysed for the presence of
plasmid and the identity of the plasmid inserts was
verified by sequencing. A plasmid whose insert contained
a fragment downstream of the AT12-encoding sequence, from
80.5kb to 81.4kb of the mon cluster, was designated pM082.

The DNA encoding AT of module 5 was amplified and
MscI and AvrII restriction enzyme recognition sites were
introduced at the ends by PCR using the following primers:
5'-CCTGGCCAGGGCGGCCAGTGGGTGGGCATG-3' and
5'-GGCCTAGGGGTCGGCCGGGAACCAGCGCCGCCAGT-3' and the cosmid
MO-CN33 as a template. The PCR product was ligated into
SmaI-treated and dephosphorylated pUC18 and the ligation
mixture was used to transform E. coli DH10B cells.
Transformant colonies were analysed for the presence of
plasmid and the identity of the plasmid inserts was
verified by sequencing. A plasmid whose insert DNA, with
sequence from about 44.2kb to 45.2kb of the mon cluster,
encoded the AT5 domain was designated pM083.

pM081 was digested with MscI and HindIII and ligated
to the 0.9kb MscI - HindIII fragment of pM082. A clone
containing both fragments was designated pM084. Plasmid
pM084 was cleaved with AvrII and HindIII, treated with
phosphatase, and ligated together with the 1.0 kb AvrII -
HindIII fragment of pM083 to produce pM085, which contains
the DNA encoding the AT5 domain flanked by DNA from either
side of the DNA encoding the AT12 domain of the monensin
PKS. The thiostrepton resistance gene tsr, derived from
plasmid pIJ702 (Katz, E. et al., J. Gen. Microbiol.
1983), was cloned into the HindIII site of pM085. The
resulting plasmid pM086 was analysed by its restriction
pattern and confirmed to contain all the desired
elements.

Plasmid pM086 was used to transform S. cinnamonensis
protoplasts as described by Hopwood, D. A. et al. (1985).
Stable thiostrepton-resistant transformants were isolated
and checked for the desired integration of pM085 into the
AT12 flanking regions by Southern blot hybridisation. One
such integrant, S. cinnamonensis MO-08, containing pM085
integrated upstream of the AT12, was passed through 4
cycles of sporulation on a non-selective nutrient
medium. Spores obtained after the fourth cycle were
replica-plated onto media with and without thiostrepton.
DNA of clones that had lost thiostrepton resistance was
analysed by Southern blot hybridisation. Clones in which
the DNA encoding the AT12 domain had been replaced by the
DNA encoding the AT5 domain were designated S.
cinnamonensis M12-AT5. At this time individual colonies
were toothpicked onto solid A medium and allowed to grow.
Four representative colonies from the A medium plate were
grown up in liquid modified YEME medium, which has
composition as follows:
Modified YEME medium

Sucrose                   100g
DIFCO Yeast extract       3g
Bacto peptone             5g
Oxoid Malt extract        3g
Glucose                   10g
H2O                       to 1L

pH adjusted to pH 7.2 with NaOH.

These cultures were used to provide a 2% vol/vol
inoculum for 30 ml of modified YEME which was grown for 7
days, and then transferred to SM16 medium, which has
composition as follows:

SM16 medium

3-[N-Morpholino]-propane sulfonic
acid (MOPS) buffer                          20.9g
L-proline                                   10.0g
Glucose                                     20g
NaCl                                        0.5g
K2HPO4                                      2.1g
Ethylenediaminetetraacetic acid,
sodium salt                                 0.25g
MgSO4.7H2O                                  0.49g
CaCl2.2H2O                                  0.029g
Trace elements solution (Hopwood,
D. A. et al. (1985) Genetic
Manipulation of Streptomyces - a
Laboratory Manual, at p. 235)               2 ml
0.5 M CoCl2 solution                        2 microlitres
H2O                                         to 1L

pH adjusted to pH 7 with NaOH.

After growth for a further 7 days, mycelium was
collected by centrifugation at 2000 x g for 30 minutes,
and the supernatant was extracted three times with 300 ml
of ethyl acetate. To confirm the presence of the C-2-ethyl
substituents of both monensin A and B, the combined
extracts were concentrated by evaporation under reduced
pressure to an oil, which was mixed with 1 ml of methanol.
Samples were applied to an LCQ liquid chromatograph fitted
with a mass spectrometer detector unit. The column used
was a C18 reversed phase column, equilibrated with a
mixture of 80% 20mM ammonium acetate/20% acetonitrile, and
the column was eluted with a gradient of increasing
acetonitrile, reaching 100% acetonitrile over 24 minutes.
Mass ions 14 mass units above those expected for both
monensin A and B confirmed production of the respective
C-2-ethyl substituents.
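
The shift of 14 mass units is what would be expected when a methyl side chain is replaced by an ethyl side chain, that is, a net gain of one CH2 unit. The arithmetic can be sketched as follows in Python; the helper name formula_mass and the dictionary representation of a formula are illustrative assumptions, and the atomic masses are the standard monoisotopic values for carbon-12 and hydrogen-1.

```python
# Monoisotopic masses (u) of the most abundant isotopes of C and H.
MASS = {"C": 12.0, "H": 1.00782503}


def formula_mass(formula):
    """Sum monoisotopic masses for a formula given as {element: count}."""
    return sum(MASS[element] * count for element, count in formula.items())


methyl = {"C": 1, "H": 3}   # side chain from a methylmalonate extender
ethyl = {"C": 2, "H": 5}    # side chain from an ethylmalonate extender

shift = formula_mass(ethyl) - formula_mass(methyl)
print(f"methyl -> ethyl mass shift: {shift:.3f} u")
```

The computed difference rounds to 14, matching the nominal mass difference reported for the mass ions.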

Example 7 Construction of pSGK005 and its use in the
production of C-13 propyl-erythromycin

Plasmid pSGK005 is a pCJR24 based plasmid containing
a PKS gene comprising a loading module plus the first and
second extension modules and the chain terminating
thioesterase of the PKS responsible for the production of
erythromycin (DEBS). The loading module comprises the KS
and ethyl-malonyl CoA specific AT from module 5 of the
monensin PKS linked to the DEBS loading ACP domain. In
addition, the active site cysteine of this module 5 KS has
been mutated to glutamine to convert an extender di-domain
to a loading di-domain. Plasmid pSGK005 was constructed
as follows.

A 2769bp DNA segment of the monensin cluster of S.
cinnamonensis extending from nucleotide 42438 to 45207 was
amplified by PCR using the following oligonucleotide
primers: 5'-GTGACGTCATATGTCGAGTGCTGAAGAGTCG-3' and
5'-GGGGTCGCCTAGGAACCAGCGCCGCCAGTCGA-3'.

The design of these primers introduced Nde I and Avr
II sites at the ends of the amplified fragment. Monensin
cosmid 05 was used as a template for the reaction. The
resulting 2769bp fragment was digested with Nde I and Xho
I and a 656bp fragment (Fragment A) purified by
preparative gel electrophoresis.
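
As a quick check on the primer design described above, the recognition sequences of Nde I (CATATG) and Avr II (CCTAGG) can be located in the two oligonucleotides, and the quoted fragment size compared with the cluster coordinates. This is a minimal sketch only: the function name find_site is an illustrative assumption, and the primer strings are copied as printed in this text, so they may carry extraction errors.

```python
# Recognition sequences of the two enzymes named in the text.
SITES = {"NdeI": "CATATG", "AvrII": "CCTAGG"}

# Primers as printed above (5' to 3').
forward = "GTGACGTCATATGTCGAGTGCTGAAGAGTCG"
reverse = "GGGGTCGCCTAGGAACCAGCGCCGCCAGTCGA"


def find_site(primer, enzyme):
    """Return the 0-based index of the enzyme's site in primer, or -1."""
    return primer.find(SITES[enzyme])


print("NdeI site in forward primer at index", find_site(forward, "NdeI"))
print("AvrII site in reverse primer at index", find_site(reverse, "AvrII"))

# Span between the cluster coordinates quoted for the amplified segment.
segment_start, segment_end = 42438, 45207
print("coordinate span:", segment_end - segment_start, "bp")
```

Both sites sit a few nucleotides in from the 5' end of their primer, and the coordinate span agrees with the 2769bp size given for the fragment.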

A second PCR reaction was used with the same template
to amplify DNA from nucleotide 43098 to 45207. The
primers used were
5'-GGGGGTGGAGGGCGGGTGGGTGAGTGTGGAGAGGGGGCAGTGCTCCTGGC-3'
and 5'-GGGGTCGGGTAGGAAGGAGGGCGGGGAGTGGA-3'.
The design of the upstream oligonucleotide primer
incorporated a change of the codon specifying the KS
active site cysteine (nucleotides 43135-43137, TGC) to
glutamine (CAG). The resulting 2109bp DNA fragment
(Fragment B) was digested with Xho I and Avr II and
purified by preparative gel electrophoresis.
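
The position of the mutated codon within the upstream primer can be worked out from the coordinates given above. The sketch below is illustrative only: it assumes that the 5' end of the upstream primer corresponds to nucleotide 43098 (the text does not state this explicitly), it uses the primer string exactly as printed here (which may carry extraction errors), and the small codon table lists only the standard codons for cysteine and glutamine.

```python
# Standard genetic code entries for the two residues involved.
CODONS = {"TGC": "Cys", "TGT": "Cys", "CAG": "Gln", "CAA": "Gln"}

primer_start = 43098   # first nucleotide amplified in the second PCR
codon_start = 43135    # first nucleotide of the KS active-site codon
offset = codon_start - primer_start  # 0-based position in the primer

# Upstream primer as printed in the text, split into blocks of ten.
primer = (
    "GGGGGTGGAG"
    "GGCGGGTGGG"
    "TGAGTGTGGA"
    "GAGGGGGCAG"
    "TGCTCCTGGC"
)

mutant_codon = primer[offset:offset + 3]
print("offset in primer:", offset)
print("wild type TGC encodes", CODONS["TGC"])
print("primer codon", mutant_codon, "encodes", CODONS.get(mutant_codon))
```

Under that assumption the codon falls 37 nucleotides into the 50-nucleotide primer, leaving ten nucleotides of matching sequence on its 3' side.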

Plasmid pCJWSO is derived from pCJR24 and DEBS1-TE in
which Msc I and Avr II sites have been introduced to flank
the AT of the DEBS loading module. This plasmid was
digested with Nde I and Avr II and the larger fragment
(Fragment C) purified by preparative gel electrophoresis.

The three fragments (Fragments A, B, C) were ligated
together using T4 DNA ligase and the ligation mixture used
to transform electrocompetent E. coli DH10B cells.
Individual clones were checked for the presence of the
desired plasmid pSGK005. The identity of pSGK005 was
confirmed by restriction pattern and sequence analysis.

Plasmid pSGK005 was used to transform S. erythraea
NRRL2338 using a routine protoplast transformation
technique. Thiostrepton resistant colonies were selected
on R2T20 media containing g/mT* thiostrepton . Further 
analysis confirmed that pSGK005 had integrated into the S.
erythraea NRRL2338 chromosome by Southern blot
hybridisation of their genomic DNA with DIG-labelled DNA
containing the actII orf4 promoter. The culture S.
erythraea NRRL2338 (pSGK005) was inoculated into 5ml tap
water medium in a 30ml flask. After three days
incubation at 29°C this flask was used to inoculate 30ml of
Ery-P medium in a 300ml flask. The broth was incubated at
29°C at 200rpm for 6 days. After this time the whole broth
was adjusted to pH 8.5 with NaOH, and then extracted twice
with an equal volume of ethyl acetate. The ethyl acetate
extract was evaporated to dryness at 45°C under a nitrogen
stream using a Zymark Turbovap LV evaporator. The product
identities were confirmed by LC/MS. A peak was observed
with a m/z value of 734 (M+H)+, required for erythromycin A.
A second peak was observed with a m/z value of 748 (M+H)+,
required for 13-propyl erythromycin A.
TABLE I

gene      function                                       start     end

gdhA      glutamate dehydrogenase (partial)               1038       0
dapA      dihydrodipicolinate synthase                    2140    1220
orf3      putative transcriptional activator              2211    3152
orf4      hypothetical protein                            3264    3680
orf5      hypothetical protein                            4307    3684
orf6      hypothetical protein                            4570    4758
orf7      hypothetical protein                            5058    5612
acpX      acyl carrier protein                            6010    5693
ksX       ketoacyl synthase                               8531    6045
monCII    probable epoxyhydrolase/cyclase                 9542    8643
monE      methyltransferase                              10426    9596
monT      monensin resistance gene (ABC-transporter)     10656   12191
monRII    probable repressor                             12205   12780
monAIX    thioesterase                                   13829   13023
monAI     polyketide synthase loading & module 1         14121   23198
            KS-L                                         14172   15486
            AT-L malonate specific                       15777   16880
            ACP-L                                        17019   17276
            KS1                                          17358   18626
            AT1 methylmalonate specific                  18960   19976
            DH1 (potential)                              20019   20519
            KR1 (inactive)                               21636   22241
            ACP1                                         22536   22793
monAII    polyketide synthase module 2                   23205   29921
            KS2                                          23307   24569
            AT2 methylmalonate specific                  24891   25913
            DH2                                          25953   26369
            ER2                                          27600   28463
            KR2                                          28485   29042
            ACP2                                         29313   29570
monAIII   polyketide synthase modules 3 & 4              29974   42372
            KS3                                          30076   31347
            AT3 malonate specific                        31798   32838
            DH3                                          32884   33465
            KR3                                          34692   35181
            ACP3                                         35553   35811
            KS4                                          35899   37170
            AT4 methylmalonate specific                  37489   38511
            DH4                                          38557   38982
            ER4                                          40123   40986
            KR4                                          41005   41562
            ACP4                                         41848   42105
monAI     polyketide synthase modules 5 & 6              42448   54564
            KS5                                          42628   43890
            AT5 ethylmalonate specific                   44221   45243
            DH5                                          45289   45744
            KR5                                          46785   47337
            ACP5                                         47593   47850
            KS6                                          47947   49218
            AT6 malonate specific                        49579   50601
            DH6                                          50644   51075
            ER6                                          52222   53102
            KR6                                          53101   53661
            ACP6                                         54052   54306
monA      polyketide synthase modules 7 & 8              54614   66934
            KS7                                          54716   55978
            AT7 methylmalonate specific                  56300   57319
            DH7                                          57358   57802
            KR7                                          59048   59608
            ACP7                                         59867   60124
            KS8                                          60185   61453
            AT8 malonate specific                        61808   62839
            DH8                                          62882   63316
            ER8                                          64577   65437
            KR8                                          65456   66016
            ACP8                                         66404   66661
monA      polyketide synthase module 9                   66952   72054
            KS9                                          67075   68340
            AT9 malonate specific                        68698   69729
            KR9 (potential)                              70735   71262
            ACP9                                         71536   71783
monH      probable regulator                             72051   74993
monCI     FAD containing epoxidase                       76541   75051
monBI     double bond isomerase                          76960   76538
monBI     double bond isomerase                          77450   77016
monA      polyketide synthase modules 11 & 12            88708   77447
            KS11                                         88612   87344
            AT11 methylmalonate specific                 87022   85993
            KR11                                         85111   84562
            ACP11                                        84292   84035
            KS12                                         83962   82694
            AT12 methylmalonate specific                 82354   81335
            DH12 (potential) delta                       81286   80855
            ER12 (potential)                             79618   78914
            KR12                                         78895   78337
            ACP12                                        78070   77812
monA      polyketide synthase module 10                  93741   88816
            KS10                                         93636   92368
            AT10 methylmalonate specific                 92040   91021
            KR10                                         90132   89584
            ACP10                                        89322   89068
monD      P450 oxygenase                                 94081   95273
monRI     probable activator                             96141   95338
monA      thioesterase                                   96941   96138
orf29     cell wall biosynthesis capK                    97580   98953
lipB      lipase B                                       99983   98991
orf31     ion pump                                      101433  100507
orf32     membrane structural protein                   102581  101490
amtA      glycine amidinotransferase                    102924  103450
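
The start and end coordinates in Table I can be turned into an orientation and a length for each gene. The sketch below is an illustration rather than part of the original disclosure: it assumes that a start coordinate greater than the end coordinate denotes a gene read on the complementary strand, and the function name describe is hypothetical. For the genes shown, the codon counts it returns agree with the lengths listed for the corresponding proteins in Table II.

```python
def describe(start, end):
    """Return (strand, length in nucleotides, length in codons).

    Assumes start > end means the gene lies on the complementary strand.
    """
    strand = "+" if end > start else "-"
    length_nt = abs(end - start) + 1
    return strand, length_nt, length_nt // 3


# A few (start, end) pairs copied from Table I.
genes = {
    "dapA": (2140, 1220),
    "orf3": (2211, 3152),
    "monE": (10426, 9596),
    "monT": (10656, 12191),
}

for name, (start, end) in genes.items():
    strand, length_nt, codons = describe(start, end)
    print(f"{name}: strand {strand}, {length_nt} nt, {codons} codons")
```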
TABLE II

GdhA, glutamate dehydrogenase (partial coding sequence) Length: 346
amino acids

1 LTTRPDTKTA LSQKTALSQL LTEIEHRNPA QPEFHQAARE VLETLAPVIA
51 ARPEYAEAGL IERLCEPERQ IVFRVPWQDD HGRVRVNRGF RVEFNSALGP
101 YKGGLRFHPS VNLGVIKFLG FEQIFKNALT GLGIGGGKGG SDFDPRGRSD
151 AEVMRFCQSF MTELYRHIGE HTDVPAGDIG VGGREIGYLF GQYRRITNRW
201 EAGVLTGKGR NWGGSLIRPE ATGYGNVLFA AAMLRERGET LEGRTAWSG
251 SGNVAIYTIQ KLAALGANAV TCSDSSGYW DEKGIDLDLL KQVKEVERAR
301 VDTYAQRRGA SARFVPGRRV WEVPADIALP SATQNELDAD DATALI

DapA, dihydrodipicolinate synthase Length: 307 amino acids

1 MTLASSLEPT TEPLFNGLYV PLVTPFTDDL RLAPEALARL ADEALSAGAS
51 GLVALGTTAE AATLTAEERE TVIRVCSAAC RAHGAPLIVG VGTNDTATAI
101 TALRELAARG DVAAALVPAP PYIRPGEAGT LAHFAALAEH GGLPLWYDI
151 PYRTGQTLGA GTITALGRLP EWGIKHATG SIDPTTMELL DSPLPGFAVL
201 GGDDIVLSPL VAAGAHGGIV ASANLRTADY AEMIALWRRG SAAPARALGA
251 DLARLSAALF TEPNPTVIKG VLHAQNRIPS PAVRMPLLAA SADSVRRAAP
301 LAASRK*

ORF3, putative transcriptional activator protein Length: 314 amino acids

1 MLDVRRLHLL RELDRRGTIA AVAEALTFTA SAVSQQLGVL EREAGVPLLE
51 RSGRRWLTP AGRSLVAHAD AVLNRLEQAV AELAGARDGI GGPLRIGTFP
101 SGGHTIVPGA LAELASRHPA LEPMVREIDS ARVSDGLRAG ELDVALVHDY
151 DFVPATPDTT VDEVPLLEEP MYLVTHAADT ATDSGSGSTL AALLGPCAEV
201 PWITARDGTT GHAMAVRACQ AAGFQPRIRH QVNDFRTVLA LVAAGQGAGF
251 VPRMAAEPSP AGWLTKLPL FRRSKVAFRA GGGAHPAIAA FVAAATTAVE
301 RMAGSRGPAG GSE*

ORF4, hypothetical protein Length: 139 amino acids 

1 MADDAYLFLL PDRHPRLGAA LAAVGALECT ETPAVHAWLQ AHEASVSSEQ 
51 VRILPADAET LIPICDAERLP VPLSEEEALK VEQECAPQTV TDMESELLAF 
101 RETTQDWQAL VHRALTAGIP AQRIARLTGL DPEEIGRL* 

ORF5, hypothetical protein Length: 208 amino acids 

1 LAVAACAAW LPIDAWRIS AADVGVLVFF AYLLPYLAIT MTVFVSVAPE
51 QVRSWARREA RGTFLQRYVL GTAPGPGGSL FIAAAALWA VLWLPGHLST
101 TFSALPRTLV ALALWAAWI CVWAFAVTF QADNLVENER ALEFPGERSP
151 AWADYVYFAL AAMTTFGTTD VDVTSRDMRR TVAANTVIAF VFNTVTVAIL
201 VSALGGR*

ORF6, hypothetical protein Length: 63 amino acids 

1 MTVMDKLKQM LKGHEDKAGQ GIDKAGDFVD GKTQGKYSGQ VDTAQDKLRD 
51 QFGSDQQEPP QR* 

ORF7, hypothetical protein Length: 185 amino acids 

1 MGTAQSQEQA AAPGACAAFV RFVLCGGGVG LASSFAWAL ASWVPWALAN 

51 ALVAWSTW ATELHARFTF GAGGRATWRQ HAQSAGSAAA AYAVTCVAMF 

101 VLQQLVAAPG AVLEQWYLS ASALAGVARF WLRLWFAR NRSLPAAAAV 

151 RTARPVRRVP APVPATVAHA ASRPAGPAAL CPAA* 

AcpX, acyl carrier protein (ACP) Length: 106 amino acids 

1 MTSTDHTSGQ DATELEKQLA AATPEEREKL LTDTIRTQAG TLLNTTLSDD 
51 SNFLENGLNS LTALELTKTL MTLTGMEIAM VAIVENPTPA QLAHHLGQEL 
101 AHTTA* 
KsX, ketoacyl-ACP synthase Length: 829 amino acids

1 VANEEKLVEY LKWTTAELHQ AQQQLRELKA [illegible]
51 TPDDLWDLVS EGRDAVTGFP DDRAWELPEE [illegible]
101 FDISDTEAVA TEPLQRLMLH LAWETVERGH [illegible]
151 DYATRLETAP DELLPYLGGG TSGSLVSGRI [illegible]
201 LVALHLACQA LRRGECGLAL AGGGTVMSTP [illegible]
251 FAAAADGMGL GEGVGLVLLE RLGDARKNGH [illegible]
301 AAPNGPSQQH VIRAALADAG LTPDQIDAVE [illegible]
351 YGADRSPDRP LWLGSVKSNT GHTQGAAGAA [illegible]
401 DRPTPLAAWK KGAVRLLTEA VDWPRREEPR [illegible]
451 PPVDEAPVPD AARDQTSPVA PELPVAWSLS [illegible]
501 TDPAPSPAEV AYSLAATRSP LEHRAVLTGT [illegible]
551 LVRSTPGAGP KKIAWHFDGR PADGVTTGAA [illegible]
601 EFHSAFPLFA SAFDEARALL DTHLPTPLPT PHSELARFAV HTALARLLLE
651 TGVRPHTLTG DGVGHIAAAY AAGILTLDDA CRLAAAHAAA AQAAEGEQPA
701 PPDAYEPVLK QLTFQRATLT LTSTAPADTP IASADYWHHH LTSPAPTAPP
751 TPETHTLLHL GALSPEGTQT SAVSALLTAL ARLHTTGGTV DWTPLVRRTP
801 HPRTIDLPTY SFQATRYWLH DHTAHAAV*






MonCII, probable epoxyhydrolase/cyclase Length: 300 amino acids

1 VKNLRIPVSQ TVSLNVRYRP ADGPGAPGRP FLLLHGMLSN ARMWDEVAAR
51 LAAAGHPAYA VDHRGHGESD TPPDGYDNAT WTDLVAAVT ALDLSGALVA
101 GHSWGAHLAL RLAAEHPDLV AGLALIDGGW YEFDGPVMRA FWERTADWR
151 RAQQGTTSAA DMRAYLRATH PDWSPTSIEA RLADYRVGPD GLLIPRLTST
201 QVMSIVAGLQ REAPADWYPK VTVPVRLLPL IPAIPQLSDQ VRAWVAAAEA
251 ALEQVSVRWY PGSDHDLHAG APDEIAADLL LLARSCEAMP GGKAGVRPA*

MonE, S-adenosylmethionine-dependent methyltransferase Length: 277
amino acids

1 VNKTVAPEPS DIGHYYDHKV FDLMTQLGDG NLHYGYWFDG GEQQATFDEA
51 MVQMTDEMIR RLDPAPGDRV LDIGCGNGTP AMQLARARDV EWGISVSAR
101 QVERGNRRAR EAGLADRVRF EQVDAMNLPF DDGSFDHCWA LESMLHMPDK
151 QQVLTEAHRV VKPGARMPIA DMVYLNPDPS RPRTATVSDT TIYAALTDIG
201 DYPDIFRAAG WTVLELTDIT RETAKTYDGY VEWIRAHRDE YVDIIGVEGY
251 ELFLHNQAAL GKMPELGYIF ATAQRP*

MonT, putative monensin resistance gene (ABC-transporter) Length: 512 
amino acids 

1 MSADLGARRW WAVGALVLAS MWGFDVTIL SLALPAMADD LGANNVELQW 
51 FVTSYTLVFA AGMIPAGMLG DRFGRKKVLL TALVIFGIAS LACAYATSSG 
101 TFIGARAVLG LGAALIMPTT LSLLPVMFSD EERPKAIGAV AGAAMLAYPL 
151 GPILGGYLLN HFWWGSVFLI NVPWILAFL AVSAWLPESK AKEAKPFDIG 
201 GLVFSSVGLA ALTYGVIQGG EKGWTDVTTL VPCIGGLLAL VLFVMWEKRV 
251 ADPLVDLSLF RSARFTSGTM LGTVINFTMF GVLFTMPQYY QAVLGTDAMG 
301 SGFRLLPMVG GLLVGVTVAN KVAKALGPKT AVGIGFALLA AALFYGATTD 
351 VSSGTGLAAA WTAAYGLGLG IALPTAMDAA LGALSEDSAG VGSGVNQSIR
401 TLGGSFGAAI LGSILNSGYR GKLDLDGVPE QAHGAVKDSV FGGLAVARAI
451 KSNGLADSVR SAYVHALDW LWSGGLGLL GWLAWWLP RHVGQSTAKT
501 AESEHEAADA V* 

MonRII, probable repressor protein Length: 192 amino acids 

1 VPGLRERKKA RTKAAIQREA VRLFREQGYT ATTIEQIAEA AEVAPSTVFR 
51 YFATKQDLVF SHDYDLPFAM MVQAQSPDLT PIQAERQAIR SMLQDISEQE 

SUBSTITUTE SHEET (RULE 26)



101 LALQRERFVL ILSEPELWGA SLGNIGQTMQ IMSEQVAKRA GRDPRDPAVR 
151 AYTGAVFGVM LQVSMDWAND PDMDFATTLD EALHYLEDLR P* 

MonAIX, thioesterase Length: 269 amino acids 

1 MDRGTAARAP QIGDEFGAAT GNGVWLRRYH AAAEAPVRLV CFPFAGGSAS 
51 YYFGLSGLLA PGVEV3LAVQY PGRQDRHAEP CLASVAELAD £WPHLPCDG 
101 KPFALFGHSL GAIVAFEVAR RLRGPAGPGL PVHLFVSGGL ARPYRPAGRS 
151 GAFGDADILA HLRAMGGTDE RFFRSPELQE LVLPALRADY RAVATYEAPG
201 PGRLDCPITA LIGDADERTS PEQAATWRER TGAAFDLRVL PGGHFYLDGC
251 QEQVAAWTE ALTAGPGV*

MonAI, polyketide synthase multi-enzyme MONS1, housing loading module
and extension module 1 Length: 3026 amino acids

1 MAASASASPS GPSAGPDPIA WGMACRLPG APDPDAFWRL LSEGRSAVST
51 APPERRRADS GLHGPGGYLD RIDGFDADFF HISPREAVAM DPQQRLLLEL
101 SWEALEDAGI RPPTLARSRT GVFVGAFWDD YTDVLNLRAP GAVTRHTMTG
151 VHRSILANRI SYAYHLAGPS LTVDTAQSSS LVAVHLACES IRSGDSDIAF
201 AGGVNLICSP RTTELAAARF GGLSAAGRCH TFDARADGFV RGEGGGLWL
251 KPLAAARRDG DTVYCVIRGS AVNSDGTTDG ITLPSGQAQQ DWRLACRRA
301 RITPDQVQYV ELHGTGTPVG DPIEAAALGA ALGQDAARAV PLAVGSAKTN
351 VGHLEAAAGI VGLLKTALSI HHRRLAPSLN FTTPNPAIPL ADLGLTVQQD
401 LADWPRPEQP LIAGVSSFGM GGTNGHWVA AAPDSVAVPE PVGVPERVEV
451 PEPVWSEPV WPTPWPVSA HSASALRAQA GRLRTHLAAH RPTPDAARVG
501 HALATTRAPL AHRAVLLGGD TAELLGSLDA LAEGAETASI VRGEAYTEGR
551 TAFLFSGQGA QRLGMGRELY AVFPVFADAL DEAFAALDVH LDRPLREIVL
601 GETDSGGNVS GENVIGEGAD HQALLDQTAY TQPALFAIET SLYRLAASFG
651 LKPDYVLGHS VGEIAAAHVA GVLSLPDASA LVATRGRLMQ AVRAPGAMAA
701 WQATADEAAE QLAGHERHVT VAAVNGPDSV WSGDRATVD ELTAAWRGRG
751 RKAHHLKVSH AFHSPHMDPI LDELRAVAAG LTFHEPVIPV VSNVTGELVT
801 ATATGSGAGQ ADPEYWARHA REPVRFLSGV RGLCERGVTT FVELGPDAPL
851 SAMARDCFPA PADRSRPRPA AIATCRRGRD EVATFLRSLA QAYVRGADVD
901 FTRAYGATAT RRFPLPTYPF QRERHWPAAA GVGQQPETPE LPESSESSEQ
951 AGHEREEGAR AWGGPEGRLA GLSVNDQERV LLGLVTKHVA WLGDASGTV
1001 QAARTFKQLG FDSMAAAELS ERLGTETGLP LPATLTFDYP TPLAVAAHLR
1051 AELTGTPAPA GSAPATGALG AGDLGTDEDP VAIVAMSCRY PGGAGTPEDL
1101 WRLVADGADA IGDFPTDRGW DLARLFHPDP DRSGTSCTRQ GGFLYDAADF
1151 DAEFFDISPR EALAVDPQQR LLLECAWEAF ERAGLDPRAL KGSPTGVFVG
1201 MTGQDYGPRL HEPSQATDGY LLTGSTPSVA SGRLSFSFGL EGPALTVDTA
1251 CSSSLVTLHL AAQALRRGEC DLALAGGATV LATPGMFTEF SRQRGLAPDG
1301 RCKPFAAGAD GTGWAEGVGL VLLERLSEAR RKGHAVLAVI RGSAINQDGA
1351 SNGLTAPNGP SQQRVIRAAL AAARLTADEV DWEAHGTGT TLGDPIEAQA
1401 LLATYGQGRS AERPLWLGSV KSNIGHTQAA AGVAGVIKMV MAMRHDLLPA
1451 TLHVDEPSGH VDWSTGAVRL LTEPWWPRG ERPRRAAVSS FGISGTNAHL
1501 VLEEAGQDEY VAGAADDAGP VDGAVLPWW SGRTGAALRE QARRLRELVT
1551 GGSADVSVSG VGRSLVTTRA VFEHRAVWG RDRDTLIGGL EALAAGDASP
1601 DWCGVAGDV GPGPVLVFPG QGSQWVGMGA QLLGESAVFA ARIDACEQAL
1651 SPYVDWSLTE VLRGDGRELS RVDWQPVLW AVMVSLAAVW ADHGVTPAAV
1701 VGHSQGEIAA VWAGALTLE DGAKIVALRS RALRQLSGGG AMASLGVGQE
1751 QAAELVEGHP GVGIAAVNGP SSTVISGPPE QVAAWADAE ARELRGRVID
1801 VDYASHSPQV DAITDELTHT LSGVRPTTAP VAFYSAVTGT RIDTAGLDTD
1851 YWVTNLRRPV RFADAVTALL ADGHRVFIEA SSHPVLTLGL QETFEEAGVD
1901 AVTVPTLRRE DGGRARLARS LAQAFGAGCA VRWENWFPAT GTSTVELPTY
1951 AFQRRRYWLE APTGTQDAAG LGLAAAGHPL LGAATEIADG DIRLLTGRIS
2001 RHSHPWLAQH TLFGAAWPA SVLAEWALRA ADEAGCPRVD DLTLRTPLVL
2051 PETAGVQVQI WGPADARDG HRDFHVYARP DGKDASEGEG IAEGEGASEG
2101 EGASGGTDAP WTCHADGRLV AEPTGTASED SPDTVWPPPG AEPVDLGDFY
2151 ERAAATGVGY GPVFTGLRAL WRRDGELFAE AVLPQEAPET AGFGMHPALL
2201 DAALHPALLG ERPAEEDIOm LPFTLTGVTL WATGATSVRV RLTPLDDDPD
2251 ASADGRAWRV GVSDPTGAEV LTCEALVAVA AGRRELRAAG ERVSDLYAVE
2301 WVPVPGPGPV GEGADFSGWA GLGECGERWE CVGRVERWYE DLDALGAAVE
2351 GGASVPSWL ATAAAAPGGA GDGAADALSA VRWTGALLDQ WLADARFADA
2401 RLWITSGAV ATGDDFLPDP AAAAVRGLVE QAQVRHPGRI LLVDTEAGAG
2451 LGVGAGVDDA LLEQAVAMAL GADEPQLALR AGRVLAPRLT APQDAAVTEA
2501 ARPLDPDGTV LITGPAGAPV ADLAEHLVRT GQCRHLLLLP GDGELEEMAE
2551 ELRGLGATVD LSTADPADPT ALAEWAAVE GDHPLTGVIH ATGWDAFDP
2601 GDSASDLMID SASDSFAEAW SSRAGVTAAL HTATAHLPLD LFAVLSPAGA
2651 DLGIARSAAA AGADAFSAAL ALRRHTTVTT DTTAPPRTTA PPRTTASPRT
2701 TALSSSRTTG VALAYGPPTA PRPGIKGTAP GRIPVLLDAA RAHGGGSPLL
2751 GARLAARALA AESAAEGVAG LPAPLRALAV AAAAAGAPTR RTAADRKPPA
2801 DWPARLAPLS APEQLRLLID AVRTHAAAVL GRTDPEALRG DATFKQLGLD
2851 SLTAVELRNR LVEDTGLRLP TALVFRYPTP AAIAAHLRER LTSPSETTAT
2901 QRSGGQTPAA GQASSALAPG GSAAGPPAAD TVLSDLTRME NTLSVLAAQL
2951 PHTETGEITT RLEALLTRWK TTNATANDSG DGNGGDDDAA ERLKAASADQ
3001 IFDFIDNELG VGHGTSRVTP TPKAG*
11 SEPTEMBER 2000



MonAII, polyketide synthase multi-enzyme MONS2, housing extension 
module 2 Length: 2239 amino acids 



1 


MASEEQLVEY 


LRRVTTELHD 


TRRRLVQEED 


RRQEPVALVG 


MACRFPGGVA 


51 


SPEDLWDLVA 


AGKDAIEDFP 


TDRGWDLEAL 


YDPDPAAYGT 


SYVRHGGFVD 


101 


DAGSFDADFF 


GISPREALAM 


DPQQRLMLET 


SWELFERAGI 


EPVSLKGSRT 


151 


GVYAGVSSED 


YMSQLPRIPE 


GFEGHATTGS 


LTSVISGRVA 


YNYGLEGPAV 


201 


TVDTACSASL 


VAIHLASQAL 


RQRECDLALA 


GGVLVLSSPL 


MFTEFCRQRG 


251 


LAPDGRCKPF 


AAAADGTGFS 


EGIGLLLLER 


LSDARRNGHK 


VLAVIRGSAV 


301 


NQDGASNGLT 


APNDAAQEQV 


IRAALDNARL 


TPSEVDAVEA 


HGTGTKLGDP 


351 


lEAGALLATY 


GQHRARPLLL 


GSLKSNIGHT 


HATAGVAGVI 


KTVMAIRNGL 


401 


LPATLHVEEL 


SPHVDWDAGA 


VEWTEPTPW 


PETGHPRRAG 


VSAFGISGTN 


451 


AHLILEEAPP 


EEDVPAPVW 


ESGGWPWW 


SGRTPEALRE 


QARRLGEFVA 


501 


GDTDALPNEV 


GWSIiATTRSV 


FEHRAWVGR 


DRDALTAGLG 


ALAAGEASAG 


551 


WAGVAGDVG 


PGPVLVFPGQ 


GAQWVGMGAQ 


LLDESAVFAA 


RIAECERALS 


601 


AHVDWSLSAV 


LRGDGSELSR 


VEWQPVLWA 


VMVSLAAVWA 


DYGVTPAAVI 


651 


GHSOGEMAAA 


CVAGALSLED 


AARIVAVRSD 


ALRQLQGHGD 


MASLSTGAEQ 


701 


AAELIGDRPG 


WVAAVNGPS 


STVISGPPEH 


VAAWADAEA 


RGLRARVIDV 


751 


GYASHGPQID 


QLHDLLTERL 


ADIRPTNTDV 


AFYSTVTAER 


LTDTTALDTD 


801 


YWVTNLRQPV 


RFADTIEALL 


ADGYRLFIEA 


SAHPVLGLGM 


EETIEQADMP 


851 


ATWPTLRRD 


HGDTTQLTRA 


AAHAFTAGAD 


VDWRRWFPAD 


PAPRTIDLPT 


901 


YAFQRRRYWL 


ADTVKRDSGW 


DPAGSGHAQL 


PTAVALADGG 


WLNGRVSAE 


951 


RGGWLGGHW 


AGTVLVPGAA 


LVEWVLRAGD 


EAGCPSLEEL 


TLQAPLVLPE 



1001 SGGLQVQWV GAADEQGGRR DVHVYSRSEQ DASAVWQCHA VGELGRASVA 
1051 RPVRQAGQWP PAGAEPVEVG GFYEGVAAAG YEYGPAFRGL RAMWRHGDDL 
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SUBSTITUTE SHEET (RULE 26) 



^ 1 1 SEPTEMBER 2000 



1101 LAEVELPEEA GSPAGFGIHP ALLDAALHPL LAQRSRDGAG AGAHGGQVLL
1151 PFSWSGVSLW ASEATTVRVR LTGLGGGDDE TVSLTVTDPA GGPWDVAEL
1201 RLRSTSARQV RGSAGPGADG LYELRWTPLP EPLPVPAPAN GRDVAADLSG
1251 CAVLGELVAE PGPGIDLEGC PCYPGVGALA DNASPPSMIL APVHSDTTGG
1301 DGLALTERVL RVIQDFIJ^P SLEQKQTRLA FVTRGAADTG STTGGSAAPA
1351 EAVDPAVAAV WGLVRSAQSE NPGRFVLLDT DAPLDQASVA "tLVDAVRSAV
1401 EADEPQVALR GGRLLVPRWA RAGEPVELAG PAGARAWRLV GGDSGTLEAV
1451 VAEACDDIVL RPLAPGQVRV AVHTAGVNFR DVLIALGMYP DPDALPGTEA
1501 AGWTEVGPG VTRLSVGDRV MGMMDGAFGP WAVADARMLA PVPPGWGTRQ
1551 AAAAPAAFLT AWYGLVELAG LKAGERVLIH AATGGVGMAA VQIARHVGAE
1601 VFATASPGKH AVLEEMGIDA AHRASSRDLA FEDAFRQATD GRGVDWLNS
1651 LTGELLDASL RLLGDGGRFV EMGKSDPRDP ELVALEHPGV SYEAFDLVAD
1701 AGPERLGLML DRLGELFAGG SLVPLPVTAW PLGRAREALR HMSQARHTGK
1751 LVLDVPAPLD PDGTVLVTGG TGTIGAAVAE HLARTGESKH LLIVSRSGPA
1801 AHGAEELVSR IAEFGAEATF VAADVSEPDA VAALIEGIDP AHPLTGWHA
1851 AGVLDNALIG SQTTESLTRV WAAKAAAAQQ LHEATRESRL GLFVMFSSFA
1901 STMGTPGQAN YSAANAYCDA LAALRRAEGL AGLSVAWGLW EATSGLTGTL
1951 SAADRARIDR YGIRPTSAAR GCALLAAARA HGRPDLLAMD LDARVPAASD
2001 APVPAVLRTL A7U\GAPATAR PTAAAAADGA TDWSGRLAGL TEEARLELLT


2051 


E LVC THAAGV 


T /^TTTV T^TV TV T 7/^ 

IjGHADACjAVy 




Uo J-i X/A. V iiJ-ir\-L>J 




2101 PAALVFDYPQ ARVLAAHLAE RLVPEGAGAM GGVSGAEGVR DAYGAGGPGG
2151 DMTAQVLLEV ARVEHTLSAA VPHGLDRAAV AARLEALLAR CTATTAATGA
2201 AGAAVEGDGD SDGDGAVDQL ETATAEQVLD FIDNELGV*





MonAIII, polyketide synthase multi-enzyme MONS3, housing extension
modules 3 and 4 Length: 4133 amino acids



1 MVSEEKLVDY LKRVSADLHA TRQRLREAEE RGQEPVAWE AACRYPGGIR
51 TPEDLWDLVA AGGNALGAFP DNRGWDLRRL FHPDPDHPGT TYAREGGFLH
101 DADLFDPEFF GISPREAAVL DPQQRLLLEC AWEALERAGI DPRSLQGSRT
151 GVYAGAALPG FGTPHIDPAA EGHLVTGSAP SVLSGRLAYT FGLEGPAVTI
201 DTACSSSLVA VHLAAHALRQ RECDLALAGG VTVMTTPYVF TEFSRQRGLA
251 ADGRCKPFAA AADGTAFSEG AGLLVLERLS DARRAGHRVL AVIRGSAVNQ
301 DGASNGLTAP NGPAQQRVIR AALAGARLSP AEVDAVEAHG TGTRLGDPIE
351 ADALLATYGQ ERHGGRPLWL GSVKSNIGHT QGAAGAAGLI KMVQALRHET
401 LPATLYADEP TPHADWESGA VRLLSAPVAW PRGEHGEHTR RAGISSFGIS
451 GTNAHLILEE APAADAEGAG GDGDGDGGGV RPWRVGATG PREEQGQGQG
501 QEQHQQQRQQ RQRSSMMPTP HLPWLLSARS PAALRAQADA LANHVAHADH
551 SIADIGGTLL RRTLFEHRAV VLGTDRDERA AALAALAAGR AHPALTRT^G
601 PARNGGTAFL FTGQGSQRPG MGRQLYDTFD VFAESLDETC ARLDPLLEQP
651 LKPVLFAPAD TAQAAVLHGT GMTQAALFAL EVALYRQVTS FGIAPSHLTG
701 HSVGEIAAAH VAGVFSLADA CTLVAARGRL MQALPAGGAM LAVQAAEDDV
751 LPLLAGQEER LSLAAVNGPT AVWSGEAAA VGEVEKALRG RGLKTKRLNV
801 SHAFHSPLIE PMLDDFREVA RGLTFHAPTL PWSNLTGRL ADAELMADAE
851 YWVRHVRRPV RFHDGLRALS EQGWRYLEL GPDPVLATMV QDGLPAPAEG
901 EEPEPWAAA LRSKHDEGRT LLGAVAALHT DGQPADLTAL FPADAGQVPL
951 PTYRFQRRRY WRVAPDAAAP ARAAGLQETG HPLLPAVIRQ ADGGILLAGR
1001 LSLRTHPWLA DHTIAGGVPL PATAFVELAL LAGRHAACDT IDDLTLETPL
1051 LLDDTGTGVG AAVGAGADAL VDAIEVQLAL GAPDGSGRRA LTVHSRPADD
1101 AADDGDAADA ADAAGRGGPG GSGDLGDPGD PGDLGDGGGS RGWRRHATGI



1151 LSAGPAAEPA APDAAPWPPA DATALDVDAL YARLDAQGYS YGPAFRAVHA 

1201 AWRHGDDLYA DVRLADEQRA EADAFALHPA LLDAALHAVD ELYRGSEGRG 

1251 QEQGQGGQEP EQGRGDADAP VRLPFSFSDI RHHATGATRL WVRLSPQGDD 

1301 RLRLSLTDGE GGQVATVDAL QLRLIPADRW RAARPTTAAP LYHLDWHELP 

1351 LPEPAETDPA AHSWAVLGAH DAGLAPAAHY PDLAALKAAV EAGEPVPDIV 

1401 FAPFPAQGTE TDVPAQVRAH ARHALELLRD WLTTEAFAAA "feLWLTTGAV 

1451 TARPEDGPAD LATAPVWGLV RAAQAEQPDH WLVDIDKDI DKDTDEETDQ 

1501 ATDAGTASRH ALPAALAAAA AQAETQLALR AGTVLVPRLA WPPRTDTPA 

1551 LHATAPESTT DTVDSTGIAG AAESGGTVLI TGGTGGLGQA VARHLAAAHG 

1601 ARHLLLVSRR GDAAEGVAEL RADLADDGVD VRVAACDITD RDALAGLLAD 

1651 IPAAHPLTAV VHTAGVIDDS LITAMTPERL DAVLAPKADA AWHLHELTRD 

1701 KDLSAFVLFS SGASVLGNGG QANYAAANTF LNTLAEHRRA AGLAATSVAW 

1751 GLWESASGGM AARLGDADRA RIHRTGVTGL TDEQALALFD AALTAEHPTV 

1801 LATRFDRAVL RGQAAARTLQ PALRGLVRTP RPTASAGAIG STAATGSATD 

1851 ENAPSSWAAR LARLSAADRD RALNELIREQ IATVLAHPSP DTIELGRAFQ 

1901 ELGFDSLTAL ELRNRLSTAT GIRLPATLVF DHPSPTALVR HLHSHLPDEA 

1951 QHTSPTAPGA SAEGTAATAT GIDDDPIAIV GMACRYPGGV TSPEQLWQLV 

2001 ATGTDAIGPF PEDRGWDTAG LFDPDPDQVG HSYTREGGFL YDAARFDAGF 

2051 FGISPREAAA TDPQQRLLLE TAWQAFEHAG IDPAALRGTP CGVITGIMYD 

2101 DYGSRFLARK PDGFEGRIMT GSTPSVASGR VAYTFGLEGP AITVDTACSS 

2151 SLVAMHIAAQ ALRQGECELA LAGGVTVMAT PNTFVEFSRQ RGLAPDGRCK 

2201 PFAAAADGTG WGEGAGLWL ERLSDARRKG HRVLALLRGS AVNQDGASNG 

2251 MTAPNGPSQE RVIRTALAGA GRGPEDIDW EAHGTGTTLG DPIEAQALLA 

2301 TYGQGRPEDR PLWLGSVKSN IGHTQAAAGV AGVIKMVMAL RHEQLPTTLH 



2351 ADEPTPHVQW DGGGVRLLTE PVPWSRGERT RRAGVSSFGI SGTNAHLILE
2401 EPPEEDLPEP VAAEPGGWP WWSGRTPDA LREQARRLGE FWGAGDVSA
2451 AEVGWSLATT RSVFEHRAW AGRDRDDLVA GMQALAAGET PTDWSGAAA
2501 SSGAGPVLVF PGQGSQWVGM GAQLLDESPV FAARIAECEQ ALSAYVDWSL
2551 SDVLRGDGSE LSRVEWQPV LWAVMVSLAA VWADYGVTPA AWGHSQGEM
2601 AAACVAGALS LEDAARIVAV RSDALRQLQG HGDMASLGTG AEQAAELIGD
2651 RPGVWAAVN GPSSTVISGP PEHVAAWAE AEARGLRARV IDVGYASHGP
2701 QIDQLHDLLT EGLADIRPAN TDVAFYSTVT AERLTDTTAL DTDYWVTNLR
2751 QPVRFADTIE ALLADGYRLF IEASAHPVLG LGMEETIEQA DIPATWPTL
2801 RRDHGDTTQL TRAAAHAFTA GADVDWRRWF PADPTPRTVD LPTYAFQHQH
2851 YWLEEPSGLT GDAADLGMVA AGHPLLGACV ELAESDSYLF TGRLSRRAPS
2901 WLAEHWAGT VLVPGAALVE WVLRAGDEAG CPTIEELTLQ APLVLPESGG
2951 LQVQVWGAT DEQSGRRDVH VYSRSEQDAS AVWVCHAVGV VSSEMPEAAA
3001 ELSGQWPPAG AEAVDVEDFY ARAAEAGYAY GPAFQGLRAL WRHGTELFAE
3051 WLPEQAGGH DGFGIHPALL DAALHPLMLL DRPADGQMWL PFAWSGVSLN
3101 ADRATHVRVR LSPRGEAAER DLRWIADAT GAPVLTVDAL TLRAADPGRL
3151 GAAARGGVDG LYTVDWTPLP LPQPLPLPRT DAGGSADWVI LSDNSSAALA
3201 DAVSSATAAG GGAPWALLAP VGGGSADDGL PWRRTLSLV QEFLAAPELT
3251 ESRLVIVTRG AVATDADGDV AASAAAVWGL IRSAQSENPG RFVLLDVEEE
3301 HLHPDGGELP YAALRHAVEE LDEPQLALRS GKFLVPRMTP AAAPEELVPP
3351 VGTSGWRLGT SGTATLENLS VIDAPEAFAP LEPGQVRISV RAAGMNFRDV
3401 LIALGMYPDK GTFAGSEGAG HVTEVGPGVT HLSVGDRVMG LFEGAFAPLA
3451 VADARMWPI PEGWSFQEAA AVPWFLTAW YGLVDLGRLR AGESLLIHAG
3501 TGGVGMAATQ IARHLGAEVF ATASPAKHGV LDGMGIDAAH RASSRDLDFE
3551 ETLRAATGGR GMDWLNSLA GEFTDASLRL LAEGGRMVDM GKTDKRDPDR
3601 VAAEHAGAWY RAFDLVPHAG PDRIGEMLAE LGELFASGAL APLPVQTWPL
3651 GRAREAFRFM SQAKHTGKLV LEIPPALDPD GTVLITGGTG VLAAAVAEHL
3701 VREWGVRHLL LAGRRGSEAP GSSELAEELT ELGAEVTFAA ADVSDPDAVA
3751 ELVGKTDPAH PLTGVIHAAG VLDDAWTAQ TPESLARVWA AKATAAHLLH
3801 EATREARLGL FLVFSSAAAT LGSPGQANYA AANAYCDALV ^QRRAEGLAG
3851 LSIGWGLWQT ASGMTGHLGE TDLARMKRTG FTPLTTEGGL ALLDAARAHG
3901 RPHWAVDLD aravaaqpaIp SRPALLRALA AGATPGARTA RRTAAAGSVA
3951 PAGGLADRLA GLPHPERRRL LLDLVRGNVA GVLGHSDHDA VRPDTSFKEL
4001 GFDSLTAVEL RNRLAAATGL KLPAALVFDY PESATLVDHL LERLSPDGAP
4051 PPVKDAADPV LNDLGRIESS LDALALDADA RSRVTRRLNT LLSKLNGAAT
4101 AGSPADVTDL DALDALDDVS DDEMFEFIDR EL*





MonAIV, polyketide synthase multi-enzyme MONS4, housing extension 
modules 5 and 6 Length: 4039 amino acids 



1 MSSAEESSPD VSGTGVSGTG ESATGTSSTE AKLRQYLKRV TVDLGQARRR 



51 LREVEERAQE PIAIVSMACR FPGDTRTPEA LWDLVAEGGD AIDDFPTNRG
101 WDLESLYHPD PDHPGTSYVR RGGFLYDAPA FDASFFGISP REALAMDPQQ
151 RVLMETAWQL LERAGIDPAS LKLSATGVYI GAGVLGFGGA QPDKTVEGHL
201 LTGSALSVLS GRISFTLGLE GPSVSVDTAC SSSLVSMHLA AQALRQGECD
251 LAIAGGVTVM STPGAFTEFS RQGALSPDGR SKAFAASADG TGFSEGAGLL
301 LLERLSDARR NGHKVLAVIR GSAVNQDGAS NGLTAPNGPS QERVIRAALA
351 NAGLGAAEVD AVEAHGTGTK LGDPIEAGAL LATYGRDRDE DRPLWLGSVK
401 SNIGHPQGAA GVAGVIKMVM ALQRELLPAT LYVDEPTPHV DWSSGSVRLL
451 TEPVPWTRGE RPRRAGVSAF GMSGTNAHVI LEEAPPEEAA AAETPAEGTG
501 AWPWWSGR GEEALRAQAA QLAEHVRDDD QRPASPLEVG WSLATTRSVF

551 ENRAVWGDD RDALLDGLRS LAAGEASPDV VSGAVGPTGP GPVMVFPGQG 

601 GQWVGMGARL LDESPVFAAR IAECEQALSA YVDWSLTDVL RGDGSELARI 

651 DWQPVLWAV MVALAAVWAD QGIEPAAWG HSQGEIAAAC WGAISLDEA 

701 ARIVAVRSVL LRQLSGRGGM ASLGMGQEQA ADLIDGHPGV WAAVNGPSS 

751 TVISGPPEGI AAWADAQER GLRARAVASD VAGHGPQLDA ILDQLTEGLA 

801 GIRPAATDVA FYSTVTAGHL TDTTELDTAY WVRNVRRTVR FADTIDALLA 

851 DGYRLFIEVS PHPVLNLALE GLIERAAVPA TWPTLRRDH GDTTQLARAA 

901 AHAFAAGADV DWRRWFPADP APRTVDLPTY AFQRQDFWPA PAGGRSGDPA 



951 GLGLAASGHP LLGASVGLAS GDVHLLSGRV SRQSAAWLDD HWAGQALVP
1001 GAAQVEWVLR AGDDAGCSAL EELTLQTPLV LPDTGGLRIQ VWEAADAHG
1051 RRDVRLFSRP DDDDAFASTH PWTCHATGVL APAPTDGTNG TRDAADTLDG
1101 AWPPADAEPV PADDLYAQ7UD RTGYGYGPAF RGVRALWRHG KDVLAEVTLP
1151 KEAGDPDGFG IHPALLDAVL QPAALLLPPT DAEQVWLPFA WNDVALHAVR
1201 ATTVRVRLTP LGERIDQGLR ITVADAVGAP VLTVRDLRSR PTDTGRLAAA
1251 ATRDRHGLFD LEWIAPENAA ENAAGPARDA SEGWVTLGED AASLADLLAS
1301 VEAGAPAPQL VAAPVEPDRT DDGLALATHV LDLVQTWLAS PLHDSRLVLV
1351 TRGAVTDADV DVAAAAVWGL VRSAQSEHPG RFTLIDLGPD DTLAAAMQAA
1401 HLEEPQLAVH GGEIRVPRLV RATTDPTAPN GTPEADRTAD PSEGLHRNGT
1451 VLITGGTGVL GRLVAEHLVT EWGVRHLLLA SRRGDQAPGS AELRARLSEL
1501 GASVEIAPAD VGDAEAVAAL IASVDPAHPL TGVIHAAGVL DDAVITAQTP
1551 ESLARVWATK ATAARHLHEA TRETPLDFFV VFSSAAASLG SPGQANYAAA
1601 NAYCDALVQH RRAQGLAGLS IAWGLWQATS GMTGQLSETD LARMKRTGFA
1651 ALTDEGGLAL LDAARAHDRA YWAADLDPR AVTDGLSPLL RALTAPATRR
1701 RVASEGLADG ALATRLAGLD ADGRLRLLTD WREYVAAVL GHGSAARVGV






1751 DIAFKDLGFD SLTAVELRNR LSAACDVRLP ATLIFDHPTP QALATHLVDR
1801 LAGSTSATTT VNATAPAAAH VAAGADVDAD TDDPVAIVAM TCRFPGGVAS
1851 PDDLWDLLDA RKDAMGAFPT DRGWDLERLF HPDPDHPGTS YTDQGGFLPD
1901 AGDFDAAFFG INPREALAMD PQQRLLLEAS WEVLERAGID PTTLKGTPTG
1951 TYVGLMYHDY AKSFPTADAQ LEGYSYLAST GSMVSGRVAY TLGLEGPAVT
2001 VDTACSSSLV SIHLATQALR HGECDLALAG GVTVMADPDM *FAGFSRQRGL
2051 SPDGRCKAYA AAADGVGFSE GVGVLLLERL SDARRHGRRV LGWRGSAVN
2101 QDGASNGLTA PNGPSQER^n: RQALASGGLS SVDVDWEGH GTGTTLGDPI
2151 EAQALLATYG QGRPEDRPLW LGSVKSNIGH TQAAAGVAGV IKMVMAMRHG
2201 WPASLHVDV PSPHVEWDSG AVRLAVESVP WPQVEGRPRR AGVSSFGASG
2251 TNAHVIVESV PDGLEEDSVS VGGEALETET DGRLVPWWS ARSPQALRDQ
2301 ALRLRDFASD ASFRAPLADV GWSLLKTRAL HEHRAWVGA ERAELIAALE
2351 ALATGEPHAA LVGPACSQAR VGGDDWWLF SGQGSQLVGM GAGLYERFPV
2401 FAAAFDEVCG LLEGPLGVEA GGLREWFRG PRERLDHTVW AQAGLFALQV
2451 GLARLWESVG VRPDWLGHS IGEIAAAHVA GVFDLADACR WGARARLMG
2501 GLPEGGAMCA VQATPAELAA DVDGSAVSVA AVNTPDSTVI SGPSDEVDRI
2551 AGVWRERGRK TKALSVSHAF HSALMEPMLA EFTEAIRGVK FRQPSIPLMS
2601 NVSGERAGEE ITDPEYWARH VRNAVLFQPA IAQVADSAGV FVELGPAPVL
2651 TTAAQHTLDE SDSQESVLVA SLAGERPEES AFVEAMARLH TAGVAVDWSV
2701 LFAGDRVPGL VELPTYAFQR ERFWLSGRSG GGDAATLGLV AAGHPLLGAA


2751 


V X_i i. x^J V— . J— 1 


TjTnPI.SRSGV 

XJ d. V_jX\.J— lO LX-O v_J V 


SWLADHWAG 


AVLVPGAALV 


EWALRAGDEV 


2801 GCVTVEELML QAPLVVPEAS GLRVQVWEE AGEDGRRGVQ IYSRPDADAV
2851 GGDDSWICHA TGVLSPESAR LDTELGGVWP PAGAEPLDVD GFYAQAGEAG
2901 YGYGPAFRGL RAVWRHGQDL LAEWLPEAA GAHDGYGIHP ALLDATLHPL



2951 LAARFMDGSE DDQLYVPFGW AGVSLRAVGA TTVRVRLRPV GESVDQGLSV
3001 TVTDATGGPV LSVDSLQTRP VKPSQLAAAQ QPDVRGLFTV EWTPLPQTDA
3051 DGEADWWLS DGVGRLADW SAAGGEAPWA WAPVDASVG DGREGLDGRL
3101 WERVLSLVQ EFLALPELAE SRLLWTRGA VATGVDGDGD VDASAAAVWG
3151 LVRSAQSENP GRFILLDVDG DGDDQGPDLN GRHLPHATLR HAAEELDEPQ
3201 LALREGTLYV PRLTQARQSA ELWPPGEPA WRLRMVHDGS LDALAAVACP
3251 EALEPLAPGQ VRIAVHAAGI NFRDVLVALG MVPAYGAMGG EGAGWTEVG
3301 PEVTHVSVGD RVMGVFEGAF GPWIAEARM VTPVPQGWDM REAAGIPAAF
3351 LTAWYGLVEL AGLKAGERVL VHAATGGVGM AAVQIARHVG AEVFATASPG
3401 KHAVLEEMGI DAAHRASSRD LAFEGTFREA TGGRGMDWL NSLAGEFIDA
3451 SLRLLGDGGR FLEMGKTDVR AAEEVAAEHA DVSYTAYDLV GDAGPDRISN
3501 MLDKLVELFA SERLKPLPVR SWPLDKAQEA FRFMSQAKHT GKLVLEIPPA
3551 LDPEGTVLVT GGTGALGQW AEHLVREWGV RHLLLASRRG PEAPGSDELA
3601 SKLTGLGAEV TIVAADVSDP ASWELVGKT DPSHPLTGW HAAGVLEDGV
3651 VTAQTPEGLA RVWAAKAAAA ANLHEATREM RLGLFWFSS AAATLGSPGQ
3701 ANYAAANAYC DALMQHRRAV GQVGLSVGWG LWEAPDAKPG VAADAKASAA
3751 TVGKASALSD GTNGSAPQDT TGTAPQGMTG GLTDTDVARM ARIGVKGMSN
3801 AHGLALFDAA HRHGRPHLVG FNLDLRTLAT HPLHTRPALL RGLATPTAGG
3851 ASRPTATAGG QPADLAGRLA ALSPSDRHHT LVRLIREQAA TVLGHHPDSL
3901 TTGSTFKELG FDSLTAVELR NRLSAATGLR LPAGLVFDHP DADILAEHLG
3951 AQLAPDGDTP AGAEATDPVL RDLAKLENAL SSTLVEHLDA DAVTARLEAL
4001 LSNWKAASAA PGSGSTKEQL QVATTDQVLD FIDKELGV*




MonAV, polyketide synthase multi-enzyme MONS5, housing extension
modules 7 and 8 Length: 4107 amino acids



1 MASEEELVDY LKRVAAELHD TRQRLREVED RRQEPVAWG MACRFPGGIE
51 TPEGLWELVA AGDDAIEPFP TDRGWDLEGI YHPDPDHPGT CYVREGGFLA
101 APDRFDSDFF GFSPREALAS SPQLRLLLET SWEALERAGI NPASLKGSPT
151 GVYVGAATTG NQTQGDPGGK ATEGYAGTAP SVLSGRLSFT LGLEGPAVTV
201 ETACSSSLVA MHLAANALRQ GECDLALAGG VTVMSTPEVF TGFSRQRGLA
251 PDGRCKPFAA AADGTGWGEG AGLILLERLS DARRKGHKVL AVIRGSAINQ
301 DGASNGFTAP NGPSQRRVIR QALSST^LST SEIDWEAHG TGTRLGDPIE
351 AEALIATYGK ereddrplwlT GSVKSNIGHT QAAAGVAGVI KMVMALQREL
401 LPATLNVDEP TPHVQWEGGG VRLLTEPVPW SRGERPRRAG ISSFGISGTN
451 AHWLEEAPP EEDVPGPVAA EPEGWPWW SARTEEALSE QARRLGEFVA
501 DTDPSTADVG WSLTTSRAIL EHRAVWGRD RDALTAGLAA LAAGEESADV
551 VAGVAGDVGP GPVLVFPGQG SQWVGMGAQL LDESPVFTVAR IAECEQALSA
601 YVDWSLSAVL RGDGSELSRV EWQPVLWAV MVSLAAVWAD YGVTPAAVIG
651 HSQGEMAAAC VAGALSLEDA ARWAVRSDA LRQLMGQGDM ASLGASSEQA
701 AELIGDRPGV CIAAVNGPSS TVISGPPEHV AAWADAEER GLRARVIDVG
751 YASHGPQIDQ LHDLLTDRLA DIRPATTDVA FYSTVTAERL TDTTALDTDY
801 WVTNLRQPVR FADTIDALLA DGYRLFIEAS AHPVLGLGME ETIEQADIPA
851 TWPTLRRDH GDTTQLTRAA AHAFTAGATV DWRRWFPADP TPRTIDLPTY
901 AFQRRSYWLP VDGVGDVRSA GLRRVEHSLL PAALGLADGA LVLTGRLAAS
951 GGGGGWLADH AVAGTTLVPG AALVEWALRA ADEAGCPSLE ELTLQAPLVL



1001 PGSGGLQVQV WGPADGQGG RREVRVFSRV DSDDEAAGQD EGWSCHATGV 

1051 LSPEPGAVPD GLSGQWPPTG AEPLEISDLY EQAASAGYEY GPSFRGLRSV 

1101 WRHGHNLLAE VELPEQAGAH DDFGIHPVLL DAALHPALLL DQNAPGEEQE 

1151 PAQPALRLPF VWNGVSLWAT GAATVRVRLA PHGGGETDDS AGLRVTVADA 



1201 TGAPVLSVDS LALRPADPEL LRTAGRAGSG TNGLFTVEWT ALPPADVADH
1251 AAGDGWAVLG QDVPDWAGAD MPRHPDMASL SAALDEGTQA PAAVFVETTA
1301 TSHATPNTAA DVTLDASGRA VAERTLHLLR DWLAEPRLAE TRLVLITHHA
1351 VTTPADDDVN AAPLDVPAAA LWGLIRSAQA EHPDRFVLLD TDAKANTDPG
1401 PDTSTDHSTA SGTYRTVIAR ALATGEPQLA VRAGELLAPR LARAATPTPE
1451 TPTPETQPDT GSGSEAGAGS GSGPGATLDP DGTVLIAGGT GMMGGLVAEH
1501 LVRAWSVRHL LLVSRQGPDA PDARDLADRL VGLGATVRIV AADLTDGRAT
1551 ADLVASVDPA HPLTGVIHAA GVLDDAWTA QTSDQLARVW AAKASVAANL
1601 DAATSELPLG LFLMFSSAAG VLGNAGQAGY AAANAFVDAL VGRRRATGLP
1651 GLSIAWGLWA RGSAMTRHLD DADLARLRAG GVKPLLDEQG LALLDAARAT
1701 AAHTSLWAA GIDVRGLNRD DVPAILRDLA GRTRRRAAAD STVDQAALER
1751 RLTGLDEAER RAWTDWRE CVAAVLGHRS AADVRTEANF KDLGFDSLTA
1801 VQLRNRLSAA SGLRLPATLA FDHPTPQALA AYLGTRLSGR TATPVAPVAP
1851 SAAATDEPVA IVAMACKYPG GATSPEGLWD LVAEGVDAVG AFPTGRGWDL
1901 ERLFHPDPDH PGTSYADEGA FLPDAGDFDA AFFGINPREA LAMDPQQRLL
1951 LEASWEVLER AGIDPTTLKG TPTGTYVGVM YHDYAAGLAQ DAQLEGYSML
2001 AGSGSWSGR VAYTLGLEGP AVTVDTACSS SLVSIHLAAQ ALRQGEGTLA
2051 LAGGVTVMAT PEVFTGFSRQ RGLAPDGRCK PFAAAADGTG WGEGVGVLLL
2101 ERLSDARRHG RRVLGWRGS AVNQDGASNG LTAPNGPSQE RVIRQALASG
2151 GLSSVDVDW EGHGTGTTLG DPIEAQALLA TYGQGRPVDR PLWLGSVKSN
2201 IGHTQAAAGV AGVIKMVMAM RHGWPASLH VDVPSPHVEW DSGAVRIAVE
2251 SVPWPEVEGR PRRAGVSSFG ASGTNAHVIV ESVPDGLGED SVSVSGEAPE
2301 TETDGRLVPW WSARSPQAL RDQALRLRDA VAADSTVSVQ DVGWSLLKTR
2351 ALFEQRAWy GRERAELLSG LAVLAAGEEH PAVTRSREDG VAASGAWWL



2401 FSGQGSQLVG MGAGLYERFP VFAAAFDEVC GLLEGPLGVE AGGLREWFR
2451 GPRERLDHTM WAQAGLFALQ VGLARLWESV GVRPDWLGH SIGEIAAAHV
2501 AGVFDLADAC RWGARARLM GGLPEGGAMC AVQATPAELA ADVDDSGVSV
2551 AAVNTPDSTV ISGPSGEVDR IAGVWRERGR KTKALSVSHA FHSALMEPML
2601 AEFTEAIREV KFTRPKVSLI SNVSGLEAGE EIASPEYWAR HVRQTVLFQP
2651 GIAQVASTAG VFVELGPGPV LTTAAQHTLD DVTDRHGPEP VLVSSLAGER
2701 PEESAFVEAM ARLHTAGVAV DWSVLFAGDR VPGLVELPTY AFQRERFWLS
2751 GRSGGGDAAT LGLVAAGHPL LGAAVEFADR GGCLLTGRLS RSGVSWLADH
2801 WAGAVLVPG AALVEWALRA GDEVGCVTVE ELMLQAPLW PEASGLRVQV
2851 WEEAGEDGR RGVQIYSRPD ADAVSGDDSW ICHATGTLTP QHTDAPNDGL
2901 AGAWPAAGAV PVDLAGFYER VADAGYAYGP GFQGLRAVWR HGQDLLAEW
2951 LPEAAGAHDG YGIHPALLDA TLHPALLLDW PGEVQDDDGK VWLPFTWNQV
3001 SLRAAGAATV RVRLSPGEHD EAEREVQVLV ADATGTDVLS VGSVTLRPAD
3051 IRQLQAVPGH DDGLFSVDWT PLPLSRTDVS QTDADGDADW WLSDGVGSL
3101 ADWSAAGGE APWAWAPVG ASAGGGLAGF DRREGLDGRL WERVLSLVQ
3151 EFLAAPELAE SRLLVLTRGA VATGGDGDGD VDASAAAVWG LVRSAQSENP
3201 GRFILLDVDM DVDVDVDMDV DVDVDVDVDV DGDGNGSDLD PDLNGRRLPH
3251 ATLRHAAEEL DEPQLALRDG QLLVPRLVRA TGGGLWAPT DRAWRLDKGS
3301 AETLESVAPV AYPGVMEPLG PGQVRLGIHA AGINFRDVLV SLGMVPGQVG
3351 LGGEGAGWT ETGPDVTHLS VGDRVMGVLH GSFGPTAVAD TRMVAPVPQG


3401 




vavT.Tawvni. 

VM.X Ij ±£\ri X V7J-J 


VFTiAniiKAGE 

V 1—1 ' ^ 1 J X\^A.V7 J— J 


RVLIHAATGG 


VGMAAVQIAR 


3451 HLGAEVFATA SAAKHWLEE MGIDAAHRAS SRDLAFEDTF RQATDGRGMD
3501 WLNSLTGEF IDASLRLLGD GGRFLEMGKT DVRTPEEVAA EYPGVTYTVY
3551 DLVTDAGPDR IAVMMSELGE RFASGALDPL PVRSWPLDKA REAFRFMSQA


3601 KHTGKLVLDV PAPLDPDGTV LITGGTGALG QWAEHLVRE WGVRHLLLAS
3651 RRGLDAPGSG ELADRLSDLG AEVTVAAADV SDPASWELV GKTDPSHPLT
3701 GWHAAGVLE DGIVTAQTPE GLARVWAAKA AAAANLHEAT REMRLGLFW
3751 FSSAAATLGS PGQANYAATU^ AYCDALMQRR RAAGQVGLSV GWGLWEAPDA
3801 KPGVAADAKP DVAADAKTGV AADGTPQGMT GTLSGTDVAR MARIGVKAMT
3851 SAHGLALLDA AHRHGRPHLV AVDLDTRVIA HKPAPALPAL LRAFAGDQGG
3901 QGGGRGGGRG GGPARPAAAT TRQNVDWAAK LSVLTAEEQH RTLLDLVRTH
3951 AAAVLGHAGT DAVRADAAFQ DLGFDSLTAV ELRNRLSAST GLRLPATFIF
4001 RHPTPSAIAD ELRAQLAPAG ADPAAPLFGE LDKLETVITG HAHDESTRTR
4051 LAARLQNLLW RLDDTSARSD HAAGASDADG DAVENRDLES ASDDELFELI
4101 DRELPS*











MonAVI, polyketide synthase multi-enzyme MONS6, housing extension
module 9 Length: 1701 amino acids






1 MPGTNDMPGT EDKLRHYLKR VTADLGQTRQ RLRDVEERQR EPIAIVAMAC
51 RYPGGVASPE QLWDLVASRG DAIEEFPADR GWDVAGLYHP DPDHPGTTYV
101 REAGFLRDAA RFDADFFGIN PREALAADPQ QRVLLEVSWE LFERAGIDPA
151 TLKDTLTGVY AGVSSQDHMS GSRVPPEVEG YATTGTLSSV ISGRIAYTFG
201 LEGPAVTLDT ACSASLVAIH LACQALRQGD CGLAVAGGVT VLSTPTAFVE
251 FSRQRGLAPD GRCKPFAEAA DGTGFSEGVG LILLERLSDA RRNGHQVLGV
301 VRGSAVNQDG ASNGLTAPND VAQERVIRQA LTNARVTPDA VDAVEAHGTG
351 TTLGDPIEGN ALLATYGKDR PADRPLWLGS VKSNIGHTQA AAGVAGVIKM
401 VMAMRHGELP ASLHIDRPTP HVDWEGGGVR LLTDPVPWPR ADRPRRAGVS
451 SFGISGTNAH LIVEQAPAPP DTADDAPEGA ATPGASDGLV VPWWSARSP
501 QALRDQALRL RDFAGDASRA PLTDVGWSLL RSRALFEQRA WAGRERAEL
551 LAGLAALAAG EEHPAVTRSR EEAAVAASGD WWLFSGQGS QLVGMGAGLY



601 ERFPVFAAAF DEVCGLLEGE LGVGSGGLRE WFWGPRERL DHTVWAQAGL
651 FALQVGLARL WESVGVRPDV VLGHSIGEIA AAHVAGVFDL ADACRWGAR
701 ARLMGGLPEG GAMCAVQATP AELAADVDGS SVSVAAVNTP DSTVISGPSG
751 EVDRIAGVWR ERGRKTKALS VSHAFHSALM EPMLGEFTEA IRGVKFRQPS
801 IPLMSNVSGE RAGEEITSPE YWARHVRQTV LFQPGVAQVA AEARAFVELG
851 PGPVLTAAAQ HTLDHITEPE GPEPWTASL HPDRPDDVAF AHAMADLHVA
901 GISVDWSAYF PDDPAPRTVD LPTYAFQGRR FWLADIAAPE AVSSTDGEEA
951 GFWAAVEGAD FQALCDTLHlT KDDEHRAALE TVFPALSAWR RERRERSIVD
1001 AWRYRVDWRR VELPTPVPGA GTGPDADTGL GAWLIVAPTH GSGTWPQACA
1051 RALEEAGAPV RIVEAGPHAD RADMADLVQA WRASCADDTT QLGGVLSLLA
1101 LAEAPATSSD TTSHTSTSCG TGSLASHGLT GTLTLLHGLL DAGVEAPLWC
1151 ATRGAVSCGD ADPLVSPSQA PVWGLGRVAA LEHPELWGGL VDLPADPESL
1201 DASALYAVLR GDGGEDQVAL RRGAVLGRRL VPDATPDVAP GSSPDVSGGA
1251 AHADATSGEW QPHGAVLVTG GVGHLADQW RWLAASGAEH WLLDTGPAN
1301 SRGPGRNDDL AAEAAEHGTE LTVLRSLSEL TDVSVRPIRT VIHTSLPGEL
1351 APLAEVTPDA LGAAVSAAAR LSELPGIGSV ETVLFFSSVT ASLGSREHGA
1401 YAAANAYLDA LAQRAGADAA SPRTVSVGWG IWDLPDDGDV ARGAAGLSRR
1451 QGLPPLEPQL ALGALRAALD GGKGHTLVAD IEWERFAPLF TLARPTRLLD
1501 GIPAAQRVLD ASSESAEASE NASALRRELT ALPVRERTGA LLDLVRKQVA
1551 AVTjRYE pgod VAPEKAFKDL GFDSLVWEL RNRLRAATGL RLPATLVYDY
1601 PTPRTLAAHL LDRVLPDGGA AELPVAAHLD DLEAALTDLP ADDPRRKGLV
1651 RRLQTLLWKQ PDAMGAAGPA DEEEQAAPED LSTASADDMF ALIDREWGTR
1701 *











MonH, probable regulatory protein Length: 981 amino acids 



1 VSGVERGVGS AGPVEQGDGL AGLVERAEAL AALRGAFDGS PGTGGSLWL
51 SGAVGTGKTA LLRAWADRIG ADADALVLTA TACRAERDLP LGVLEQLVRS
101 PGLPPASAER ALAWWDEEAS ATPGKTDANG TSANGTDANG TGAGQTGAGQ
151 AGVGQTGVGG EPVLAASALR GLCEVLRDLL AERPVWAVD DAHHADAASL
201 QCLLSWRRL RSARLHVLFT EYAHQKAQNA LLSSEFLHEP ALRRIRLEPL
251 SKAGVEALLA RHLDERTAQD LTPWHGMSA GHPLLVRALA l^DHRAAGGAG
301 EAYGRAVLSF LYRHETPVTQ VARAIAALGA HAGPGQVGRL LDVDAASVER
351 AVRQLTVAEV LHEGRLCHPiT FAAAVLDGMP PEERRALHGR VADLLHEEGA
401 PATEVAAHLV AADRSDAPWA VPVFQEAAQL ALDEDQVETG VDYLRAAHQR
451 CRGAAQRAAV VGALADAEWR LDPAKVLRHL PDPAAMAPQT DPAALAPHTD
501 PAPTAAPTAA PTPTPIPTTP PLPTHLLWHG RVEEGLDAIG TLTGPGPNPA
551 GAPPMNPT^L DTPWLWGAYL YPGHVKERLG SGALSPQRST PPAVTPELQG
601 AGTLMNDLLH GGERDATEAA ERALNRYRLG PRTIAVQTAA LAALTYRDRP
651 HRAAAWCDGL VAQADERNSP TWRALFTAWR ALLHLRQGDP AAAEQRAETA
701 LALLGSKGWG AAIGLPLAAA VQAKAALGDV DGAAALLERP VPQAVFQTRT
751 GLHYLAARGR YHLATGCHYA ALCDFYACGT RMSSWGVDLP ALEPWRLGAA
801 EAYLALGEGL LARQLVDGQL PLPTPDDGRT WGMTLRLRAA TSPAPARAEL
851 LDEAVAVLRE SGDTFELARA VADQAVAVRE GGEAERARLL ARKAELLARR
901 WGSAPAPATV PEPPERPGPA TPDAELTSAE RRVAELAAEG FTNREISRKL
951 CVTVSTVEQH LTRIYRKLDV RRLDLQAALG *




MonCI, flavin-dependent epoxidase Length: 496 amino acids 


1 VTTTRPAHAV VLGASMAGTL AAHVLARHVD AVTWERDAL PEEPQHRKGV
51 PQARHAHLLW SNGARLIEEM LPGTTDRLLA AGARRLGFPE DLVTLTGQGW
101 QHRFPATQFA LVASRPLLDL TVRQQALGAD NITVRQRTEA VELTGSGGGS




151 GGRVTGVWR DLDSGRQEQL EADLVIDATG RGSRLKQWLA ALGVPALEED
201 RLFKAPPGAT THFPAVNIAA DDRVREPGRF GWYPIEGGR
251 WLATLSCTRG AQLPTHEDEF IPFAENLNHP ILADLLRDAE PLTPVFGSRS
301 GANRRLYPER LEQWPDGLLV IGDSLTAFNP IYGHGMSSAA RCATTIDREF
351 ERSVQEGTGS ARAGTRALQK AIGAAVDDPW ILAATKDIDY VNCRVSATDP
401 RLIGVDTEQR LRFAEAITAA SIRSPKASEI VTDVMSLNAP ^AELGSNRFL
451 MAMRADERLP ELTAPPFLPE ELAWGLDAA TISPTPTPTP TAAVRS

MonBII, carbon-carbon double bond isomerase Length: 141 amino acids 

1 MPDEAARKQM AVDYAERINA GDIEGVLDLF TDDIVFEDPV GRPPMVGKDD 
51 LRRHLELAVS CGTHEVPDPP MTSMDDRFW TPTTVTVQRP RPMTFRIVGI 
101 VELDEHGLGR RVQAFWGVTD VTMDDPAGPA DTTHPEGIRA * 

MonBI, carbon-carbon double bond isomerase Length: 144 amino acids 

1 MNEFARKKRA LEHSRRINAG DLDAIIDLYA PDAVLEDPVG LPPVTGHDAL 
51 RAHYEPLLAA HLREEAAEPV AGQDATHALI QISSVMDYLP VGPLYAERGW 
101 LKAPDAPGTA RIHRTAMLVI RMDASGLIRH LKSYWGTSDL TVLG 

MonAVIII, polyketide synthase multi-enzyme MONS8, housing extension 
modules 11 and 12 Length: 3754 amino acids 



1 MSNEEKLLDH LKWVTAELRQ ARQRLHDKES TEPVAIVGMA CRYPGGARSA
51 EDLWELVRDG GDAVAGFPDD RGWDLESLYH PDPEHPATSY VRDGAFLYDA
101 GHFDAEFFGI SPREATAMDP QQRLLLETAW EAIEHAGMNP HALKGSDTGV
151 FTGVSAHDYL TLISQTASDV EGYIGTGNLG SWSGRISYT VGLEGPAVTV
201 DTACSSSLVA IHLASQALRQ GECSLALAGG STVMATPGSF TEFSRQRGLA
251 PDGRCKPFAA AADGTGWGEG AGWALELLS EARRRGHKVL AVIRGSATNQ
301 DGTSNGLAAP NGPSQERVIR AALANARLSA EDIDAVEAHG TGTTLGDPIE
351 AQALLATYGQ GRPEDRPLWL GSVKSNIGHT QAAAGVAGVI KMVMAMRNGL
401 LPTSLHIDAP SPHVQWEQGS VRLLSEPVDW PAERTRRAGI SAFGISGTNA
451 HLILEEAPPE EDAPGPVAAE PGGWPWWS GRTPDALREQ ARRLGEFAAG
501 LADASVSEVG WSLATTRALF DQRAVWGRD LAQAGASLEA LAAGEASADV
551 VAGVAGDVGP GPVLVFPGQG SQWVGMGAQL LDESPVFAAR IAECEQALSA
601 HVDWSLSDVL RGDGSELSRV EWQPVLWAV MVSLAAVWAD YblTPAAVIG
651 HSQGEMAAAC VAGALSLEDA ARIVAVRSDA LRQLQGHGDM ASLSTGAEQA
701 AELIGDRPGV WAAVNGPSS TVISGPPEHV AAWADAEAQ GLRARVIDVR
751 YASHGPQIDQ LHDLLTDRLA DIQPTTTDVA FYSTVTAERL DDTTALDTAY
801 WVTNLRQPVR FADTIEALLA DGYRLFIEAS PHPVLNLGIQ ETIEQQAGAA
851 GTAVTIPTLR RDHGDTTQLT RAAAHAFTAG APVDWRRWFP ADPTPRTVDL
901 PTYAFQHKHY WVEPPAAVAA VGGGHDPVEA RVWQAIEDLD IDALAGSLEI
951 EGQAESVGAL ESALPVLSAW RRRHREQSTV DSWRYQVTWK HLPDVPAPEL
1001 SGAWLLLVPA AHADHPAVLA TAQTLTAHGG EVRRHWDAR AMERTELAQE


1051 LRVLMDGAAF AGWNLLALD EEPHPEHSAV PAGLAATTAL VQALADNGAD
1101 IAVRTLTQGA VSTSAGDALT HPVQAQVWGL GRVAALEYPR LWGGLVDLPA
1151 RIDHQTLARL AAALVPQDED QISIRPSGVH ARRLxAHAPAN TVGSGLGWRP
1201 DGTTLITGGT GGIGAVLARW LARAGAPHLL LTSRRGPDAP GAQELAAELT
1251 ELGAAVTVTA CDVGDREQVR RLIDDVPAEH PLTAVIHAAG VPNYIGLGDV
1301 SGAELDEVLR PKALAAHHLH ELTRELPLSA FVMFSSGAGV WGSGQQGAYG
1351 AAMHFLDALA EHRRAEGLPA TSIAWGPWAE AGMAADQAAL TFFSRFGLHP
1401 LSPELCVKAL QQALDAGETT LTVANFDWAQ FTSTFTAQRP SPLLADLPEN
1451 RRASAPAAQQ EDATEASSLQ QELTEAKPAQ QRQLLLQHVR SQAAATLGHS
1501 DVDAVPATKP FQELGFDSLT AVELRNRLNK STGLTLPTTV VFDHPTPDAL
1551 TDVLRAELSG DAAASADPVR AAGASRGAAD DEPIAIVGMA CRYPGDVRSA
1601 EELWDLVAAG KDAMGAFPDD RGWDLETLYD PDPESRGTSY VREGGFLYDA
1651 GDFDAGFFGI SPREAVAMDP QQRLLLETAW EAIERAGLDR ETLKGSDAGV
1701 FTGLTIFDYL ALVGEQPTEV EGYIGTGNLG CVASGRVSYV LGLEGPAMTI
1751 DTGCSSSLVA IHQAAHALRQ GECSLALAGG ATVMATPGSF VEFSLQRGLA
1801 KDGRCKPFAA AADGTGWAEG VGLWLERLS EARRNGHNVL 'J^VIRGSAINQ
1851 DGTSNGLTAP NGQAQQRVIR QALANARLSA EDVDAVEAHG TGTMLGDPIE
1901 ASALVATYGK ERPADRPLWL GSIKSNIGHA QASAGVAGVI KMVMALRNEQ
1951 LPASLHIDAP TPHVDWDGSG VRLLSEPVSW PRGERPRRAG VSAFGISGTN
2001 AHLILEQAPD APEPVTAPAE DAAAPAGWP WWSARGEEA LRAQARLLAD
2051 RATADPRIAS PLDVGWSLVK TRSVFENRAV WGKDRQTLL AGLRSLAAGE
2101 PSPDWEGAV QGASGAGPVL VFPGQGSQWV GMGAQLLDES PVFAARIAEC
2151 ERALSAHVDW SLSAVLRGDG SELSRVEWQ PVLWAVMVSL AS WAD YG IT
2201 PAAVIGHSQG EMAAACVAGA LSLEDAARIV AVRSDALRQL MGQGDMASLG
2251 AGSEQVAELI GDRPGVCVAA VNGPSSTVIS GPPEHVAAW ADAEARGLRA
2301 RVIDVGYASH GPQIDQLHDL LTERLADIRP TTTDVAFYST VTAERLDDTT


23 51 


TLDTDYWVTN 


LRQPVRFADT 


lEALLADGYR 


LFIEASPHPV 


LNLGMEETIE 


2401 


RADMPATWP 


TLRRDHGDAA 


QLTRAAAQAF 


GAGAEVDWTG 


WFPAVPLPRV 


2451 


VDLPTYAFQR 


ERFWLEGRRG 


liAGDPAGLGL 


ASAGHPLLGA 


AVELADGGSH 


2501 


LLTGRISPRD 


QAWIiAEHRVM 


DTVLLPGSAF 


VELALQAAVR 


AGCAELAELT 


2551 


LHTPLAFGDE 


GAGAVDVQW 


VGSVAEDGRR 


PVTVHSRPTG 


EGEEAVWTRH 


2601 


AAGWAPPGP 


DAGDASFGGT 


WPPPGATPVG 


EQDPYGELAS 


YGYDFGPGSQ 


2651 


GLVSAWRLGD 


DLFAEVALPE 


AESGRADRYQ 


VHPVLLDATL 


HALILDAVTS 


2701 


SADTDQVLLP 


FSWSGLRVHA 


PGAEKLRVRI 


ARTAPDQIiAL 


TAVDGGGGGE 
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2751 


PVLTLESLTV 


RPVAAHQIAG 


ARAADRDALF 


RLVWMEVAAR 


AEETGGGAPR 


2801 


AAVLAPVESG 


PMGGTSAGAL 


ADALSDALAA 


GPVWDTFGAL 


RDGVAAGGEA 


2851 


PDWLAVCAA 


PGAGAGAVAD 


ADGRGGDPAG 


YARLATVSLL 


SLLKEWVDDP 


2901 


AFAATRLWV 


TRGAVAARPG 


ETAGDLAGAS 


LWGLVRSAQA 


ENPGRLTLLD 


2951 VDGLESSPAT LTGVLASGEP ELALRDGRAY VPRLVRDDAS VRLVPPVGSL


3001 TWRLARCQEA GGGQQLSLVD APEAGRALEP hevrvavratT'apgpltagqv


3051 EGAGWTEVG GEVGSVAVGD RVMGLFDAVG PVAVTDAALL MPVPAGWSWA


3101 QAAGSLGAYV SAYHVLADVV APRGGETLLV GEETGSVGRA VLRLALAGRW


3151 RVEAVDGAST ADDSGAERAA DVTLRHEGAL WHRAGGRPD EGQAWPPEP


3201 GRVREILAEL TELTELAEIT ESAEPGLPAE RGDSRALTPL DITVWDIRQA


3251 PAAMAAPPSA GTTVFSLPPA FDPEGTVLVT GGTGALGSLT ARHLVERYGA


3301 RHLLLSSRRG ADAPGALELA ADLSALGARV TFAACDPGDR DEAAALLAAV


3351 PSDHPLTAVF HCAGTVNDAV VQNLTAEQVE EVMRVKADAA WHLHELTRDA


3401 DLSAFVLYSS VAGLLGGPGQ GSYTAANAFL DALARHRHDG GAAATSLAWG


3451 YWELASGMSG RLTDADRARH ARAGWGLGA DEGLALLDAA WAGGLPLYAP


3501 VRLDLARMRR QAQSHPAPAL LRDLVRGGSK SGGGAVSAGA AALLKSLGAM


3551 SDPEREEALL DLVCTHIAAV LGYDAATPVN ATQGLRELGF DSLTAVELRN


3601 RLSAATGLKL PATFVFDHPN PAELAAQLRQ ELAPRAADPL ADVLAEFERI


3651 EDSLLSVSSK DGSARAELAG RLRATLARLD APQDTAGEVA VATRTRIQDA


3701 SADEIFAFID RDLGRDGASG QGNGQPTGQG NGHGNGNGNG NGNGHGQAVE


3751 GQR*











MonAVII, polyketide synthase multi-enzyme MONS7, housing extension 
module 10 Length: 1642 amino acids 

1 MAHTEEKLLE YLKRVTADLR QTERRLQDVE SAGHEPVAVI GMACRLPGGV
51 RSPEEFWELV STGGDAVAPL PGNRNWDLDS LYDPDPESTG TSYVREGGFV 
101 YDAGDFDPTF FGIGPTEAAA MAPQQRLALE TAWEAIERAG IDPLSLRSSD


151 TSTFIGCDGL DYALGASEVP EGTAGYFTIG NSGSVTSGRV AYTLGLEGPA


201 VTVDTACSSS LVSLHLATQA LRTQECSLAL AGGTYVMSSP APLIGFSELR


251 GLAPDGRCKP FSASSDGMGM AEGTGWLLE RLSDARRKGH KVLAVIRGSA


301 INQDGASNGL TAPNGPAQER VIRAALANAR LAPEDIDAVE AHGTGTTLGD


351 PIEAGALISA YGRERPEDRP LWVGAVKSNI GHTQIAAGVA ^IKMVLALR


401 HDLLPAILHV DAPSPHVEWD GSGLRLLTDP VKWPRGERPR RAGVSSFGFS


451 GTNAHLILEE APPEEEDVPCf SVAEEPGGW PWWSGRTPD ALRAQARRLG


501 EFAAGPADAS AADVGWSLTT TRSVFEHRAV WGRDRDALT AGLGALAAGE


551 ASAGWAGVA GDVGPGPVLV FPGQGSQWVG MGAQLLDESP VFAARIAECE


601 RALSAYVDWS LSAVLRGDGS ELSRVEWQP VLWAVMVSLA AVWADYGVTP


651 AAVIGHSQGE MAAACVAGAL SLEDAARIVA VRSDALRRLQ GHGDMASLST


701 GAEQAAELIG DRPGVWAAV NGPSSTVISG PPEHVAAWA DAEARGLRAR


751 VIDVGYASHG PQIDQLHDLL TERLADIRPA NTDVAFYSTV TAERLTDTTA


801 LDTDYWVTNL RQPVRFADTI EALLADGYRL FIEASAHPVX. GLGMEETIEQ


851 ADIPATWPT LRRDHGDTTQ LTRAAAHAFT AGAPVDWRRW FPADPTPRTV


901 DLPTYAFQHQ HYWLERSASA SGAVSGEQSA AEAQLWHAVE ELDLGLLAET


951 LGSEEGSEEA VRALEPALPV LKGWRRRHQD QATIDSWRYR VTWKQRSDGP


1001 APELGGDWLL FVPADKAEHP AVRATAEALS EHGAAAVRLH PVETGRAGRQ



1051 ELAAVDTAGL AGIVNLLALD EEPHPEHPAV PAGLAATTAL LQALGDNGTT 

1101 APLHTVTQGA VSTGATDPLT HPLQAHVWGL GRVAALEHPR LWAGLVDLPA 

1151 RIDRHTLPRL AAALLPQDDE DQTAVRPTGI HHRRLTHAVG SIQNPVHSEA 

1201 TWRPRGTTLI TGGTGGIGAV LARWLARQGA PRLHLTSRRG PDAPGARELA

1251 AELDGLGTAV TITACDVSDP RQLSGLIDDM PAEHPLTAVI HAAGMTDLTA
1301 IGDLTTARLG EVLGSKSDAA WNLHELTRDL DLSAFVMFSS GAGVWGSGQQ

1351 GAYGAANHFL DALAEHRRAQ GLPATSIAWG PWAEAGMSAD PESLTYFKRF

1401 GLLPIAPDLC VKALHQAVDA GDATLTVANF DWAKFTPTFT AQRPSPFLDD
1451 LPENQREAEQ TGTAAETSAF REELAKTPAS QRLGFLVQQV RTYAAATLGR
1501 TVEDIPAAKP FQELGFDSLT AVQLRNQLNT TTGLSLPATV IFDHPTPEAL 
1551 ATHLRGQLGD GAEVAGEGDV LAALDKWDTA FGAAEVDEAA "kRRIVGRLQV 
1601 LVSKWSPAQD GPEGTDSAHA DLEAASADDI FDLISSEFGK S*

MonD, cytochrome P450 hydroxylase Length: 431 amino acids 

1 VGLTVGPDNA KRGIVPITDS KPAATFPDLV DPSFWARPHA ERVALFEEMR 

51 GLPRPAFIRQ NMPGVPWTFG YHALVKYADI VEVSRRPQDF SSNGATTIIG 

101 LPPELDEYYG SMINMDNPEH SRLRRIVSRS FGRNMIPEFE AVATRTARRI 

151 IDELIARGPG DFIRPVAAEM PIAVLSDMMG IPAEDHDFLF DRSNTIVGPL 

201 DPDYVPDRAD SERAVIEASR ELGDYIAGLR AERLAAPGND LITKLVQVQA 

251 DGEQLTRQEL VSFFILLVIA GMETTRNAIS HALVLLTEHP EQKQLLLSDF 

301 DTHAPNAVEE ILRVSTPINW MRRVATRDCD MNGHRFRRGD RIFLFYWSGN

351 RDESVFPDPY RFDITRGTNA HVTFGAVGPH VCLGAHLARM EITVLYRELL

401 AALPQIHAVG QPRRLDSSFI EGIKHLHCAF * 

MonRI, probable activator protein Length: 268 amino acids 

1 VRYEMLGPLR IKDGNDYATI NAQKVEIVLT VLLIRADRW SLEQLMREIW 

51 GEDLPRRATA GLHVYISQLR KFLKVPGSAG NPVETRAPGY VLHKRDDDQI 

101 DAQIFPELVD VGRSLLREKR FDEAASCFGQ ALALWRGPIL GQGGNGPGTN 

151 GPIIDGFSTW LTEIRLECQE MLVECQLQLG RHREAVGMLY ALTAENPMCE 

201 AFYRQLMLAL YRSERQADAL KVYQSVRKTL NDELGLEPGR PLQELQRAIL 

251 AGDMHLMSPP PLALSGR* 

MonAX, thioesterase Length: 278 amino acids

1 LSAFLAKGKI LSAFPPPDMS DPWIRRFRPR PEAWRLVCF PHAGGSASYY 

51 HPLAQSPTLP TDSEVLAVQY PGRQDRRRER LLDDIGELAD LITDALGPFD 

101 DRPLAFFGHS MGAVLAYEVA QRLRERTGKQ PCRLFVSGRR APSRFRRGTV 

151 HLLDDTELAA ELRRAGGTDP RFLDDEELLA EIIPWRNDY RAVELYRWNP

201 SPPLSCPITA LVGDRDPQAP LDEVEAWQQH TEGPFDLKVF AGGHFYLNTH

251 QQGVTEVISK ALADSAQQRA TARGNAR*

ORF29, a homologue of CapK involved in cell wall biosynthesis Length: 428 
amino acids 



1 LADLVAHARS ASPYYRELYH GLPERIEDPT LLPVTDKKQL MDHFDDWPTD


51 RDITFEKVRA FTDDPELIGR RFLGRYLVAT TSGTSGRRGL FVLDDRYMNV


101 SSAVSSRVLA SWLGPLGIAR AWHGGRFAQ LVATEGHYVG FAGYSRLRQD


151 GEARSKLVRA FSVHEPMSRL VAELNEYRPA FVIGYASTIM LFTAEQEAGR


201 LHIDPVLVEP AGETMTESDT DRIAAAFGAK VRTMYSATEC TYLSHGCAEG


251 WYHVNDDWAV LEPVDADHRP TPPGEFSHTT LISNLANRVQ PFLRYDLGDS


301 VMLRPDPCPC GTPSPAIRVQ GRSGDILTFP SGRGDDVSLA PLAFSSLFDR


351 MPGVELFQIE QTAPSTLRVR WQAPGADAD HVWQRAHDGL THLLADNKLD


401 NVTVERGEEP PRQASGGKYR TIIPLAA*






LipB, lipase B Length: 338 amino acids 






1 VKVPVEVTVR LSSWLGGLVA AVLAATVLPA SAASAADVSS PPLEIPAAEL


51 AKALHCGTEL GDLRDAGDKP TVLFVPGTGL KGEENYAWNY MAELKKKGYQ


101 SCWVDSPGRG LRDMQESVEY WYATRAIQE ATGRKVDLVG HSQGGLLTAW


151 ALRFWPDLPG KVDDMVTLGS PFQGTRLASP CRPIAEVAGC PASVLQFARD


201 SNWSKALGAD GTPMPAGPSY TTIYSYADES WADGEAPSL PGAHRIGVQD
251 ICPGRPWPTH IAMWDQVSY DLVADAIEHP GPADTSRIDR AHCAKPVMPL
301 NSQEAVDALP GLLNFPIELL IHSQPWVDEE PPLRPYAR 

ORF31, putative ion pump Length: 309 amino acids 

1 MGHDHGPSAG AAGGTLSGTY RKRLLWTIGI SGSITVIQW GALLSGSLAL 

51 LADAAHSLTD AVGVSLALGA ITLAQRAPTP RRTFGFCRVE IFSAVLNALL 

101 LWIFAWVLW SAIGRFSEPV EVKGGLMFW ALGGLAANLV GLWLLRDAKE 

151 KSLNLRGAYL EVLGDALGSV AVIVGGLVIL LTGWQAADPI ASIVIGLLIV

201 PRAYGLLRDS LHVLLEATPQ DVDLGEVRRH LLEERGWAV HDLHGWTVTS 

251 GMPVLTAHW VTEEALASGY GELLGRLQRC VGGHFDVAHS TIQLEPEGHV 

301 EEDGALHT* 

ORF32, hypothetical membrane protein Length: 364 amino acids 

1 MTRALTLHDW IVAGIAWAG WAGLLLRAL LRWLGERASK TRWSGDDVIV 
51 DALRTLVPCA AITAGLAAAA GALPLTPRTG RNVTMTLTAL LILAATLTAA 
101 RIVTGLVKAV AQSRSGVAGS ATIFVNITRV WLAMGFLIV LQTLGISIAP 
151 LLTALGVGGL AVALALQDTL ANLFAGVHIL AAKTVQPGDY IQLSSGEEGY 

201 WDINWRNTT VRQLSNNLVI IPNAKLAGTN MTNYSRPEQE LSIMVQVGVS
251 YDSDLEQVEK VTTEWDEVM AEITGAVPDH EAAIRFHTFG DSRISFTVIL 
301 GVGEFSDQYR IKHEFIKRLH QRYRAEGIRV PAPVRTVRVQ QGELPPPLGI 

351 PHQRDTSTQA RLH*

AmtA, glycine amidinotransferase (partial coding sequence) 
Length: 131 amino acids 

1 MSPVNSHNEW DPLEEIIVGR LEGATIPSSH PWACNIPTW AARLQGLAAG
51 FEYPQRLIEP AQQELDQFIA LLQSLDVTVR RPAAVDHKHR FGTPDWQSRG 
101 FCNSCPRDSM LWGDEIIET PMAWPCRCFE T 